Summary

Overview 

This  document  is  a  preliminary  report  on  one  aspect  of  the 
initial  phase  of  a  proposed  three-year  research  program  on  distributed 
data  management.   The  work  dealt  with  here  is  the  development  of  models 
for  data  distribution.   These  models  consist  of  equations  for  system 
cost,  availability,  and  response  time  in  terms  of  appropriate  parameters 
describing  system  behavior,  usage  patterns,  etc.   This  interim  report 
deals  with  models  which  look  at  the  system  from  a  very  high  level.   Low- 
level  features  -  strategies,  policies,  etc.  -  will  be  built  in  later  so 
that  their  effects  on  cost,  response,  and  availability  can  be  assessed. 

Cost Model

Besides  providing  a  tool  for  further  research,  the  modeling 
effort  has  yielded  some  immediate  insights  into  the  advantages  -  and 
problems  -  of  distributed  data  management.   We  have  found,  for  example, 
that  data  distribution  can  be  cost  effective  -  in  the  sense  that  it  may 
be  actually  cheaper  to  store  at  a  remote  site  -  for  reasonable  parameter 
values  and  a  not  excessive  cost  differential  between  sites.   In  addition, 
it  appears  that  this  result  is  fairly  insensitive  to  the  size  of  the 
data  set  to  be  transferred,  but  it  does  require  that  the  data  be  compressed 
for  shipment. 

Availability Model

Perhaps  the  most  interesting  result  of  the  availability  study 
is  the  following.   If  there  are  two  copies  of  the  data  base  (located  at 
different  sites)  and  both  are  kept  immediately  accessible  and  as  up  to 
date  as  possible,  at  least  one  copy  of  the  data  base  is  available  more 
than  99  percent  of  the  time.   (This  result  does  not  take  into  account 
scheduled down-time or the (small) possibility that both sites are down
concurrently.) If the remote copy is not a true "running spare" - i.e.,
is an inactive backup stored, say, on tape - the improvement in
availability seems hardly enough to make such backup worthwhile. This result
serves  to  emphasize  the  importance  of  developing  techniques  for  on-line 
data  base  synchronization. 

Response Model

The  study  of  response  time  has  led  to  some  simple  relations 
which  should  be  useful  in  algorithms  for  determining  when  a  site  should 
share  its  query  load  with  other  sites  holding  a  copy  of  the  data  base. 
Rather  than  being  amenable  to  a  priori  study,  the  parameters  appearing 
in  this  model  are  envisioned  as  being  provided  by  real  system  monitoring 
and  measurement,  so  that  they  are  appropriate  for  decision  making  in  a 
dynamic  environment. 

Report Format

In  the  next  section  we  discuss  the  goals  of  the  modeling 
program,  both  in  the  long  term  (as  a  research  tool)  and  in  the  short 
term  (i.e.,  for  the  work  presented  here).   Following  this,  we  briefly 
review  work  reported  in  the  literature  which  seems  pertinent  to  our 
effort.   The  major  part  of  the  document  is  then  devoted  to  detailed 
reports  on  the  three  models:   cost,  availability,  and  response  time,  in 
that  order. 

Report Validity

The  reader  should  note  that  the  results  given  in  this  report 
are  to  be  considered  tentative.   The  models  are  in  the  process  of  revision 
and refinement, as well as more thorough testing. This is only a
preliminary report; conclusions reached from the models in their present state
should  not  be  relied  upon  or  widely  disseminated. 

Goals of the Modeling Program

Long-Term  Goals 

Developing  models  to  describe  the  various  aspects  of  distributed 
data  management  is  an  integral  part  of  our  research  program.   Model 
building  -  the  development  of  equations  to  describe  system  behavior, 
costs,  etc.  -  is  an  essential  tool  in  computer  science  research.   Using 
a  good  model,  one  can  study  alternative  design  options,  compare  decision 
strategies,  etc. 

It  is  important  that  the  model  effectively  reflect  the  real 
world  that  is  to  be  studied.   For  this  reason,  we  plan  to  build  a  model 
which  is  highly  modular  and  highly  parameterized.   The  modularity  will 
provide  flexibility  and  allow  us  to  study  some  aspects  of  the  problem 
independently  of  having  a  detailed  model  of  a  whole  system.   For  example, 
network  problems  of  synchronization  and  deadlock  can  probably  be  studied 
through  a  high-level  networking  model  which  does  not  concern  itself  with 
details  of  data  management.   The  parametrization  will  allow  us  to  put 
into  a  high-level  model  guessed  values  for  the  effects  of  lower-level 
systems.   In  this  way  we  can  generate  some  insights  into  what  is  going 
on  before  building  a  complete  model.   Parametrization  also  should  allow 
us  to  mimic  real  PWIN  system  behavior  as  closely  as  available  measurement 
data will allow. This will maximize the PWIN-relevance of our research
results.

The  actual  research  areas  which  we  plan  to  study  in  part 
through  modeling  have  been  described  in  some  detail  in  our  Research  Plan 
(CAC  Doc.  No.  164,  JTSA  Doc.  No.  5510).   In  the  interests  of  brevity,  we 
will  not  repeat  that  information  here. 

Goals of this Preliminary Study

In  this  preliminary  work,  we  have  been  limited  by  time  constraints 
to  the  construction  of  fairly  superficial  models  and  to  the  identification 
of  some  promising  directions  for  further  work.   In  order  to  get  a  good 
feel  for  the  broad  range  of  model  components  that  may  be  useful  to  us, 
we  planned  a  three-pronged  effort,  directed  towards  assessing  the  gross 
effects of data distribution on costs, availability, and response time.
Our  approach  has  been  to  survey  the  modeling  literature  for  work  which 
seemed  relevant  to  our  program  and  to  begin  to  extend  such  work  to 
study  the  problems  of  distributed  data  management. 

In  order  to  develop  a  cost  model  for  network  data  distribution, 
we  have  begun  with  a  cost  model  of  a  hierarchical  storage  system 
and extended it by including storage at a remote site as part of the hierarchy.
We  have  then  attempted  to  identify  the  major  cost  components  which  the 
network  introduces  into  total  cost.   By  carefully  including  these 
network  components,  we  hope  to  have  designed  a  basic  model  which  we 
may  then  expand  on  (by  providing  a  greater  level  of  detail)  to  study 
such  things  as  the  cost  overhead  of  various  synchronization  strategies. 

To  study  availability,  we  have  taken  a  similar  approach. 
We  have  begun  with  a  model  for  single-site  data  base  recovery  strategies 
and  tried  to  see  how  the  assumptions  and  results  of  that  model  are  affected 
by  locating  the  backup  copy  at  a  remote  site. 

The  complexity  of  response-time  models  (involving,  as  they 
usually  do,  heavy  usage  of  stochastic  analysis  and  queueing  theory) 
precluded  our  getting  very  deeply  into  this  area  in  the  short  term. 
Instead,  we  undertook  a  rather  superficial  investigation  into  the  conditions 
under which multiple data copies and query distribution may lead to an
improvement  in  system  responsiveness. 


In  summary,  the  primary  goal  of  this  preliminary  effort  has 
been  to  gain  an  understanding  of  the  components  needed  to  model  the 
major  features  of  a  distributed  data  management  system.   But  we  feel 
that  the  models  developed,  although  crude,  do  form  a  solid  basis  for 
future  work  and  have  already  provided  us  with  some  insights  into  the 
value  of  data  distribution. 


Models  in  the  Literature 

Introduction 

The  first  step  in  our  modeling  program  has  been  to  look  closely 
at  relevant  models  reported  in  the  literature.   Actually,  modeling  is  an 
extensively  used  tool  in  computer  science,  and  models  of  one  sort  or 
another  are  found  throughout  the  literature.   For  example,  models  are 
used  for  comparisons  and  evaluations  of  alternative  system  designs. 
Such  mathematical  design  analysis  is  much  less  costly  and  time-consuming 
than  actually  building  alternative  systems  and  trying  them  out.   Models 
are  also  heavily  used  in  optimization  studies.   For  example,  optimal  (or 
near  optimal)  file  allocations  can  be  derived  from  rather  simple  formulas 
for  cost  and  response  time.   The  simplicity  of  the  formulas,  it  should 
be  noted,  is  not  a  drawback  of  the  model  but  an  advantage.   Working  with 
models  instead  of  the  complex  real  world  allows  one  to  focus  attention 
on  only  those  features  that  one  wishes  to  study. 

The  models  that  we  review  in  this  section  are  therefore  generally 
not  complex  and  all-encompassing,  but  are  simple  formulas  which  seem  to 
have  some  relevance  to  problems  of  interest  to  us.   The  discussion  is 
organized  primarily  according  to  the  output  of  the  model,  and  only 
secondarily  according  to  its  context  or  application.   First,  we  consider 
response-time  models  and,  under  this  heading,  other  types  of  models 
dealing  with  time  delays  (e.g.  in  a  network)  or  the  time  it  takes  a 
process  to  be  carried  out.   Thus  we  have  collected  together  various 
models  which  may  have  some  relevance  to  the  overall  response  time  of  a 
distributed  data  base.   Second,  we  consider  very  briefly  the  closely 
related  concept  of  throughput,  and  techniques  for  modeling  it. 


Third,  we  review  various  models  which  may  be  useful  to  the 
study  of  data  availability  -  essentially  the  probability  that  the  data 
base  is  accessible  when  needed.   Overall  availability  may  involve  such 
factors  as  network  reliability,  system  failures,  and  recovery  strategies  - 
all  of  which  have  been  individually  modeled  in  some  context.   Finally, 
we  look  briefly  at  cost  models  -  valuable  for  their  ability  to  encompass 
all  kinds  of  resource  utilization,  but  not  nearly  so  extensively  studied 
as  the  other  types  of  models. 

Models for Response Time

An  important  quantity  for  the  evaluation  of  a  data  management 
system  is  the  expected  response  time,  which  may  be  defined  as  the  average 
waiting  time  from  the  initiation  of  a  data  request  (or  from  the  input  of 
a  query)  to  the  receipt  of  the  information.   Many  different  aspects  of  a 
data  management  system  have  an  effect  on  the  total  response  time.   These 
aspects  run  from  the  low-level  physical  organization  of  data  to  (in  a 
distributed  environment)  network  delay  times.   In  this  section  we  briefly 
review  some  important  past  work  on  modeling  these  various  aspects  and 
indicate  where  further  work  appears  needed  to  model  a  complete  system. 

Data  structure  modeling.   At  the  lowest  level,  models  have 
been  developed  to  aid  in  choosing  storage  schema.   A  typical  approach  is 
that  of  Gotlieb  and  Tompa  [1974].   They  consider  a  number  of  alternative 
structures  -  trees,  linked  lists,  etc.  -  and  assume  an  expected  usage 
pattern  which  involves  the  probabilities  that  the  various  nodes  in  the 
schema will be accessed. They then compute expected run-time costs for
the  alternatives.   These  "costs"  are  actually  timing  estimates,  being 
computed as a simple linear combination of "the number of executions of
each of three primitive instruction types: memory accesses, arithmetic
and logical instructions, and transfers of control". The expected
numbers of executions of the different commands are obtained by simulation
studies  of  application  programs.   This  last  point  is  an  important  one  - 
at  some  stage  in  almost  any  modeling  effort,  data  from  simulations  or 
from  the  measurement  of  real  system  behavior  is  needed.   Another  point 
to note in Gotlieb and Tompa's work is that some very important
considerations in choosing storage schema appear only as constraints. For example,
an  upper  bound  on  the  allowable  amount  of  storage  space  is  used  to 
eliminate  certain  schema  from  further  scrutiny.   A  model  which  would 
allow  for  a  trade-off  between  storage  cost  and  access  efficiency  would 
seem  to  have  more  validity. 

At  what  is  perhaps  a  higher  level,  Shneiderman  [1974]  has 
developed  a  model  for  optimizing  the  structure  of  multilevel  indexes. 
Again,  he  describes  his  model  as  a  "cost"  model,  but  he  is  explicitly 
computing  access  times.   His  approach  is  a  very  simple  one.   Assuming 

1.  a  given  number  of  levels, 

2.  the  branching  pattern  of  the  index  tree, 

3.  a  strategy  for  searching  the  tree, 

4.  the  costs  (times)  for  moving  from  node  to  node  in  the  search, 
and 

5.  an  equal  probability  of  request  for  all  items, 

he  derives  a  straightforward  algebraic  formula  for  expected  search  time. 
As  an  obvious  (and  necessary)  generalization,  he  suggests  relaxing 
assumption (5). Shneiderman's basic approach, however, appears to be a
useful  one  which  may  be  readily  incorporated  into  any  analyses  of  tree 
structures  which  arise  in  our  modeling  effort. 
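
The kind of calculation involved can be illustrated with a minimal sketch. It is not Shneiderman's formula: it assumes, purely for illustration, that each index node is scanned sequentially, that every item is requested with equal probability (assumption 5), and that moving from one level to the next has a fixed cost. The names `expected_search_cost`, `fanouts`, `compare_cost`, and `move_cost` are hypothetical.

```python
def expected_search_cost(fanouts, compare_cost=1.0, move_cost=0.0):
    """Expected cost of locating one item in a multilevel index.

    fanouts[i] is the number of entries per node at level i (root first).
    Illustrative assumptions: sequential scan inside each node and equal
    request probability for all items, so a node with b entries needs
    (b + 1) / 2 comparisons on average.
    """
    cost = 0.0
    for b in fanouts:
        cost += compare_cost * (b + 1) / 2.0 + move_cost
    return cost


def items_indexed(fanouts):
    """Number of items reachable through an index with the given fan-outs."""
    total = 1
    for b in fanouts:
        total *= b
    return total


if __name__ == "__main__":
    # Three hypothetical ways of indexing the same 10,000 items.
    for fanouts in ([10000], [100, 100], [10, 10, 10, 10]):
        print(fanouts, items_indexed(fanouts), expected_search_cost(fanouts))
```

Under these assumptions, adding levels reduces the expected number of comparisons, while a nonzero `move_cost` penalizes deep trees; the trade-off between the two is what an index-optimization model of this type explores.
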

A more ambitious effort in the use of modeling to evaluate
file structures was carried out by Winkler and Dale [1971]. In their
words,  they  study  "the  processing  time  required  to  evaluate  Boolean 
functions  defined  on  data  values  ...  [and  to]  select  elements  from  the 
structure  satisfying  the  expression".   They  derive  some  rather  complex 
algebraic  formulas  for  expected  processing  time.   There  are  over  twenty 
input  parameters  describing  such  things  as  properties  of  the  average 
query,  file  size  and  timing  data.   Specific,  alternative  structures  are 
modeled  in  the  sense  that  processing  time  formulas  are  developed  for 
them.   This  paper  merits  closer  study  in  our  proposed  work  on  data 
structuring. 

Computer  system  modeling.   Many  of  the  response-time  models  in 
the  literature  that  may  be  of  use  to  us  are  not  specifically  concerned 
with data management but with computer systems in general. Both
time-sharing systems and multiprogramming systems have been the subject of
considerable analysis. Both situations are characterized by competition
for shared resources. Several jobs reside in the system simultaneously
and must occasionally wait for processing, I/O, etc. The natural
mathematical models to describe the progress of jobs through such a system of
waiting  lines  and  processors  are  those  of  queueing  theory.   Indeed, 
queueing  theory  has  been  heavily  and  successfully  used  to  develop  formulas 
for  response  time  in  such  systems. 

A classic example is Scherr's analysis of response time for
time-sharing  systems  [Scherr,  1967].   Scherr  defines  response  time  as 
the  mean  length  of  time  the  user  spends  in  the  "working  part  of  the 
interaction"  -  i.e.,  the  time  between  when  he  finishes  typing  in  his 
query and when the response is returned to him. The main input parameters
are the mean time per interaction that the user spends in thinking and
typing, and the mean processor time per interaction. Simplifying
assumptions are that the system is in a steady state (i.e., essentially that
the total number n of users on line is constant) and that there is no
overhead due to additional swapping as n increases. The latter
assumption is questionable and leads to the result that response time increases
only proportionately to n for large n. The mathematical analysis is
quite simple. A Markov process describes the probability distribution
for the number of users actually inside the system, and the resulting set
of recursive equations is readily solved. Expected response time can
be  immediately  calculated  from  this  probability  distribution.   The 
validity  of  this  simple  queueing  model  was  demonstrated  by  comparing  its 
predictions  with  real  system  measurements.   The  agreement  was  extremely 
close. 
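
A model of this type can be sketched in a few lines. The sketch below is the standard finite-population queueing model with a single processor; it assumes exponentially distributed think and processor times, and the function and parameter names are illustrative rather than Scherr's own notation.

```python
def finite_population_response(n_users, think_time, service_time):
    """Mean response time for n_users sharing a single processor.

    State k is the number of users waiting for or receiving service.
    The steady-state balance equations give the recursion
        p[k + 1] = p[k] * (n_users - k) * service_time / think_time.
    Assumes exponential think and service times (an assumption of this
    sketch) and no swapping overhead as n_users grows.
    """
    ratio = service_time / think_time
    weights = [1.0]  # unnormalized p[0]
    for k in range(n_users):
        weights.append(weights[-1] * (n_users - k) * ratio)
    p_idle = weights[0] / sum(weights)
    # The processor completes interactions at rate 1/service_time while busy.
    throughput = (1.0 - p_idle) / service_time
    # Each user cycles through think + response; n_users cycles overlap.
    return n_users / throughput - think_time


if __name__ == "__main__":
    # Hypothetical parameters: 30 time units of thinking, 1 of processing.
    for n in (1, 10, 30, 60, 120):
        print(n, round(finite_population_response(n, 30.0, 1.0), 3))
```

For large n the processor is almost never idle, so the result approaches n times the service time minus the think time. This is the linear growth noted above, and it follows directly from the no-overhead assumption.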

More  elaborate  analyses  of  time-sharing  systems  have  been 
carried  out  by  Kleinrock.   (For  a  good  review  of  this  work,  see  [Kleinrock, 
1973].) Kleinrock's analyses include various queueing disciplines
(scheduling algorithms) and various probabilistic assumptions on job
arrival times and processing time required. He has extended this type
of model virtually to its limit, in the sense that further
generalizations lead to intractable mathematical formulations.

Queueing models also play a key role in the study of
multiprogramming systems. These differ from time-sharing systems primarily
in that there is no assumed interlude when the user is thinking and
typing. That is, a certain steady-state population of jobs is assumed
to be continually moving through the system. A model which seems
especially relevant to our work is that of Arora and Gallo [1971], who
are particularly interested in the optimal storage of data in a
multilevel memory. They define the expected response time of a transaction
as  "the  serial  sum  of  the  service  times  along  with  the  respective 
waiting  times  at  all  facilities",  and  emphasize  the  importance  of  this 
statistic  in  evaluating  data  management  systems.   The  most  important 
parameters  in  their  model  are  the  I/O  dependent  timings,  such  as  the 
access  times  to  various  memory  devices  and  the  time  required  to  transfer 
a  block  of  data  from  auxiliary  to  main  memory.   The  model  is  rather 
detailed  and  complex,  but  has  the  obvious  potential  to  be  extended  to 
the  study  of  data  distribution  in  a  network.   One  need  only  consider 
some  memory  levels  to  be  located  remotely  and  take  into  account  network 
delay  times. 

File  allocation  modeling.   Models  which  have  been  devised  to 
study  data  distribution  are  usually  developed  from  higher  level  (and 
less  sophisticated)  analyses  than  those  referred  to  above.   An  example 
is  the  response-time  formula  derived  by  Chu  [1973]  in  his  study  of 
optimal  file  allocation  in  a  network.   Variables  in  his  formula  include 
line  traffic  between  nodes  (assuming  it  is  all  generated  from  data  base 
access),  usage  rates  of  files  by  users  at  various  sites,  and  average 
lengths  of  messages.   An  interesting  feature  is  the  result  that  network 
transmission  delays  increase  with  line  traffic  according  to  the  simple 
factor $P/(1 - P)$, where $P$ denotes the fraction of line capacity used by
the given traffic, or traffic intensity. Indeed, Chu's expected
response-time formula (for queries initiated at one given site and responded to
by another) is simply

$$\text{Response Time} \sim \frac{tP}{1 - P},$$

where $t$ = average time to transmit a reply message. One sees that many
features are lacking in this simple model - time to transmit requests,
time  to  access  the  data  at  the  remote  site,  protocol  overhead  time,  etc. 
In  addition,  there  appears  to  be  an  implicit  assumption  that  all  pairs 
of  sites  are  connected  by  a  direct  line  used  only  for  the  query  traffic. 
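
The behavior of the delay factor is easy to tabulate. The following sketch simply evaluates the expression quoted above; the function name and the sample values are hypothetical, and none of the omitted components (request transmission, remote access time, protocol overhead) are included.

```python
def chu_response_time(reply_time, traffic_intensity):
    """Evaluate t * P / (1 - P) for one pair of sites.

    reply_time        -- t, average time to transmit a reply message
    traffic_intensity -- P, fraction of line capacity in use (0 <= P < 1)
    """
    if not 0.0 <= traffic_intensity < 1.0:
        raise ValueError("traffic intensity must satisfy 0 <= P < 1")
    return reply_time * traffic_intensity / (1.0 - traffic_intensity)


if __name__ == "__main__":
    # Hypothetical reply time of 2 time units at increasing line loads.
    for p in (0.1, 0.5, 0.8, 0.9, 0.95):
        print(p, round(chu_response_time(2.0, p), 2))
```

The table makes the nonlinearity plain: the delay doubles between P = 0.9 and P = 0.95 in roughly the same way it does between P = 0.5 and P = 0.67, so a heavily loaded line is far more sensitive to added query traffic than a lightly loaded one.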

Network  delay  modeling.   As  we  noted  in  discussing  Chu's 
simple  model  for  response  time  from  a  remote  site,  real  network  delays 
involve  many  complex  factors.   Fortunately,  much  work  has  been  done  on 
developing  realistic  formulas  for  network  delays.   (For  a  good  review 
see  [Kleinrock,  1973].)   This  work  has  been  largely  done  in  the  setting 
of  network  design  and  analysis.   For  example,  queueing  models  have  been 
used  to  compute  average  packet  delays  for  given  network  topology,  routing 
strategy,  and  network  traffic  (including  overhead  for  routing,  flow  and 
error  control,  etc.).   There  seems  to  be  no  reason  why  such  models  can 
not  be  incorporated  into  overall  models  of  response  time  for  a  distributed 
data  base.   Detailed  modeling  of  network  delays  will  provide  a  necessary 
tool  for  studying  synchronization  strategies  and  other  network-related 
features  of  distributed  data  management. 

Models for Throughput

Some  authors  argue  that  response  time  is  not  as  important  a 
statistic  as  is  throughput.   For  example,  Arora  and  Gallo  [1971]  put  the 
case  as  follows:   "In  a  multi-programming  environment  the  response  time 
does  not  measure  the  efficiency  of  the  system,  because  of  the  concurrent 
processing  of  several  transactions.   For  this  reason,  we  introduce 
throughput  rate  as  a  performance  measure  for  the  multi-programming 
systems. It is the rate of completion of transactions per unit time."
(The  underlining  is  ours.)   In  his  analysis  of  multiprogramming  systems, 
Buzen  [1971]  takes  the  same  point  of  view,  defining  "overall  system 
performance" as the "average number of jobs processed per unit time". A
queueing  theory  analysis  will,  however,  generate  either  response  time  or 
throughput  rate  with  equal  ease.   That  is,  these  models  assume  that  a 
certain  number  of  jobs  are  in  the  system  and  essentially  it  is  the  time 
the  average  job  spends  in  the  system  that  is  computed.   Thus  "response 
time" in these models never means the absolute time it takes an otherwise
empty system to do the job, but is always in the context of
competition with other jobs.
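
The equivalence of the two statistics can be made explicit with Little's result, which is not stated in the works cited here but holds for any system in steady state. With symbols chosen for this note only, let $\bar{N}$ be the average number of jobs in the system, $X$ the throughput rate, and $R$ the mean time a job spends in the system:

```latex
\bar{N} = X \, R
\qquad\Longrightarrow\qquad
R = \frac{\bar{N}}{X}, \quad X = \frac{\bar{N}}{R}.
```

When the job population is taken as fixed, as in the multiprogramming models above, computing either $R$ or $X$ therefore determines the other.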

In  network  analysis  throughput  has  also  been  a  useful  statistic. 
For  example,  in  ARPANET  analyses  the  network  throughput  has  been  defined 
as  the  average  traffic  per  node  when  average  packet  delay  equals  0.2 
seconds  [Frank  and  Chou,  1974].   This  maximum  acceptable  average  time 
delay  then  gives  meaning  to  the  notion  of  throughput  or  "the  level  of 
traffic  that  the  network  can  handle".   However,  it  is  again  queueing 
analysis  which  is  used  to  model  the  flow  of  packets  through  the  network 
and  to  compute  throughput  under  various  conditions. 

Once  throughput  or  level  of  traffic  flowing  through  a  system 
becomes  a  statistic  of  interest,  the  possibility  of  using  models  analogous 
to  those  used  for  physical  flow  systems  arises.   The  stochastic  models 
and recursion equations of queueing theory may be replaced by the
continuous models and differential equations of diffusion theory. There has
recently  been  considerable  interest  in  applying  this  type  of  model  to 
queueing  networks  (see,  for  example,  [Reiser  and  Kobayashi,  1974]), 
since  more  complex  initial  and  boundary  conditions  can  be  imposed  than 
are  tractable  in  stochastic  models.   Of  course,  the  close  connection 
between  throughput  and  average  response  time  means  that  diffusion  models 
could  be  very  useful  in  response-time  studies.   We  therefore  plan  to 
look  closely  into  the  applicability  of  diffusion  models  to  our  research. 

Models for Availability

We  here  use  the  term  availability  to  mean  the  fraction  of  time 
that  a  data  base  is  available  to  respond  to  user  requests  or  queries. 
In any setting, and particularly in a network, availability is a
function of the reliability (or availability) of many components - host
computers,  network  communications  lines,  etc.  -  as  well  as  of  strategies 
for  backup  and  recovery.   In  this  section  we  discuss  some  of  the  past 
modeling  research  that  has  yielded  results  useful  to  us  in  our  concern 
with  database  availability. 

File  allocation  modeling.   One  of  the  factors  to  be  taken  into 
account  in  distributing  copies  of  a  file  to  various  network  sites  is  the 
number  of  copies  needed  for  an  acceptable  degree  of  availability.   Chu 
[1973]  takes  account  of  this  factor  in  the  following  way.   First,  he 
defines the availability of a piece of equipment (e.g., communication
line or computer) as

$$\text{Availability} = \frac{F}{F + X},$$

where F is the mean time between failures and X is the mean time to
repair. Then, assuming

1)  all  computers  in  the  network  have  identical  availability  A, 

2)  all  communication  channels  have  identical  availability  c,  and 

3)  the  network  is  completely  connected; 

Chu obtains the following formula for the availability of the $j$th file:

$$A\left(1 - (1 - Ac)^{r_j}\right),$$

where $r_j$ is the number of copies of the $j$th file in the network. Once A
and c are known, it is a simple matter to choose $r_j$ so as to bring the
availability of a remote copy up to a satisfactory level. Overall
availability is bounded by A, the availability of the requesting computer,
which is apparently assumed not to possess a copy of the file.
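
The choice of the number of copies can be sketched as below. The code evaluates the two expressions quoted above under Chu's three assumptions; the function names and the sample values of A, c, and the target are hypothetical.

```python
def component_availability(mtbf, mttr):
    """Availability F / (F + X) from mean time between failures and to repair."""
    return mtbf / (mtbf + mttr)


def file_availability(a, c, copies):
    """Chu's expression A * (1 - (1 - A*c)**r) for a file with r remote copies."""
    return a * (1.0 - (1.0 - a * c) ** copies)


def copies_needed(a, c, target):
    """Smallest number of copies whose file availability reaches the target.

    The target must lie below A, since A bounds the overall availability.
    """
    if not 0.0 < a * c <= 1.0:
        raise ValueError("A and c must be positive and at most 1")
    if target >= a:
        raise ValueError("target cannot reach or exceed the bound A")
    copies = 1
    while file_availability(a, c, copies) < target:
        copies += 1
    return copies


if __name__ == "__main__":
    a = component_availability(mtbf=190.0, mttr=10.0)  # hypothetical hours
    c = 0.90                                           # hypothetical channel value
    for r in (1, 2, 3, 4):
        print(r, round(file_availability(a, c, r), 4))
    print("copies for 0.94:", copies_needed(a, c, 0.94))
```

Because each added copy multiplies the unavailability of the remote copies by (1 - Ac), the gain from successive copies falls off quickly and the result saturates at A.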

Although Chu's model, with its assumption of complete
homogeneity of network components, may seem oversimplified, an analogous
analysis  can  be  readily  carried  out  in  the  heterogeneous  case  to  yield 
more  complex  expressions.   Notice,  however,  that  this  model  presents 
another  problem.   It  implicitly  assumes  that  the  files  are  static,  or 
are  simultaneously  kept  up  to  date  by  some  trouble-free  process.   In 
fact,  the  development  of  algorithms  to  keep  segments  of  a  data  base 
identical  (or  nearly  so)  is  a  topic  of  current  research.   (See  the 
chapter  on  Automated  Backup  in  CAC  Doc.  No.  162,  JTSA  Doc.  No.  5509.) 

Network  reliability  modeling.   Another  simplification  in  Chu's 
model  is  the  assumption  that  a  direct  communication  line  connects  every 
pair  of  sites.   This  assumption  allows  Chu  to  use  a  single  parameter  to 
describe  availability  of  a  link  from  one  site  to  another.   In  a  general 
network,  this  availability  will  depend  in  a  complex  way  upon  network 
topology.   Several  alternate  paths  may  exist  between  two  given  sites. 
Each  of  these  paths  may  involve  more  than  one  "hop"  and  so  more  than  one 
piece  of  subnet  hardware.   Indeed,  in  the  ARPA  network  it  has  been  found 
that the failure rate for IMP's is about the same as that for
communication channels, and that IMP failures therefore have the more drastic
effect  on  communications  reliability  [Frank,  Kahn,  and  Kleinrock,  1972]. 
Graph  theoretical  techniques  for  computing  availability  from  component 
reliabilities are, however, well known. The paper by Frank et al.
contains a brief review of these techniques. No great difficulty is
envisioned in applying them to any given network (such as the WIN) to obtain
availabilities which may then be used in a straightforward extension of
Chu's model to obtain rough estimates of file (or data base) availability.
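
For a small network, the availability of a connection between two sites can be computed by brute-force enumeration of component states. The sketch below is one such approach, not a technique taken from Frank et al.; it assumes that links fail independently and that nodes never fail (node failures could be handled by adding them as further components). All names and availability values are hypothetical.

```python
from itertools import product


def connected(edges, source, target):
    """Return True if target can be reached from source over undirected edges."""
    reached = {source}
    frontier = [source]
    while frontier:
        node = frontier.pop()
        for u, v in edges:
            for a, b in ((u, v), (v, u)):
                if a == node and b not in reached:
                    reached.add(b)
                    frontier.append(b)
    return target in reached


def two_terminal_availability(links, source, target):
    """Probability that source and target are connected.

    links maps an undirected link (u, v) to its availability. Every up/down
    combination is enumerated, so the cost grows as 2**len(links).
    """
    edges = list(links)
    total = 0.0
    for state in product([True, False], repeat=len(edges)):
        prob = 1.0
        up = []
        for edge, is_up in zip(edges, state):
            prob *= links[edge] if is_up else 1.0 - links[edge]
            if is_up:
                up.append(edge)
        if connected(up, source, target):
            total += prob
    return total


if __name__ == "__main__":
    # Hypothetical three-site network: a direct line plus a two-hop path.
    links = {("A", "C"): 0.90, ("A", "B"): 0.90, ("B", "C"): 0.90}
    print(round(two_terminal_availability(links, "A", "C"), 4))
```

The value obtained for a pair of sites can then stand in for the single channel parameter c in an extension of Chu's model.
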

Modeling computer system reliability. Another parameter in
Chu's model that requires more detailed analysis for complete understanding
is computer availability. One source of information on computer
availability is direct system measurement. On a lower level, however,
failures can be modeled to yield, in addition to overall figures on
expected system reliability, useful insights into repair and backup
strategies.

Borgerson  and  Freitas  [1975]  recently  published  a  fairly 
detailed  stochastic  model  for  computer  system  failure.   Their  model  is 
based  on  four  distinct  causes  of  crashes  and  their  interrelationships. 
Their  ultimate  result  is  a  formula  giving  the  probability  density  for 
the  event  that  the  system  crashes  due  to  a  failure.   The  effects  of 
mechanisms  for  detecting  and  recovering  from  a  failure  (before  the 
system  actually  crashes)  are  included  in  the  analysis.   Although  our 
research  is  unlikely  to  be  concerned  with  modeling  computer  systems  at 
this  level  of  detail,  the  analytical  techniques  of  Borgerson  and  Freitas 
may  well  apply  to  reliability  problems  which  we  may  wish  to  model  (e.g. 
protocol resiliency).

Modeling  backup  and  recovery  strategies.   This  section  has 
previously  dealt  with  availability  questions  involving  network  and  site 
reliabilities.   On  a  lower  level,  the  data  base  itself  may  "crash"  or 
may  acquire  errors.   It  is  important  that  strategies  for  returning  a 
data  base  to  its  correct  state  be  devised  and  studied. 

A recent paper [Chandy et al., 1975] provides models for
rollback and recovery strategies. These strategies run as follows. At
certain points in time (checkpoints), a copy of the data is made and
stored. A listing of subsequent data updates (i.e. an audit trail) is
then kept. When the master data base fails, it may then be recovered by
beginning  with  the  old  copy  from  the  checkpoint  and  using  the  audit 
trail  to  bring  it  up  to  date.   Chandy  et  al.  use  queueing  theory  to 
model  the  processing  of  the  audit  trail.   From  the  expected  time  to 
complete  this  process,  they  can  compute  the  total  recovery  time.   The 
length  of  the  audit  trail,  and  hence  the  time  to  recover,  is  a  function 
of  the  time  interval  between  checkpoints.   Optimization  of  availability 
with  respect  to  intercheckpoint  time  can  then  be  carried  out.   Models  of 
some complexity are developed which take into consideration the
possibility of errors during recovery and the possibility of a transaction
arrival rate which varies in a cyclic manner (as opposed to being
constant). The results appear to be very useful for developing insights
into  recovery  strategies,  particularly  for  single-site  systems.   In  a 
network  environment,  however,  it  may  be  reasonable  to  assume  that  the 
backup  copy  is  stored  remotely.   In  this  case  it  does  not  make  sense  to 
assume  that  the  data  is  always  restored  from  the  backup,  because  of  the 
long  time  required  to  transfer  a  data  base  through  the  network.   The 
strategy  then  is  to  transfer  the  queries  to  the  available  copy.   (See  the 
later  section  on  the  availability  model.) 
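
The checkpoint and audit-trail scheme described above can be sketched
in a few lines. This is only an illustration of the mechanism, not of
the queueing analysis of Chandy et al.; the class and method names are
hypothetical, and the data base is assumed to be a simple key-value
mapping held in memory.

```python
import copy


class CheckpointedStore:
    """Toy data base protected by a checkpoint copy plus an audit trail."""

    def __init__(self):
        self.data = {}          # the master data base
        self._checkpoint = {}   # copy made at the last checkpoint
        self.audit_trail = []   # updates applied since that checkpoint

    def update(self, key, value):
        # Log the update first, then apply it to the master copy.
        self.audit_trail.append((key, value))
        self.data[key] = value

    def checkpoint(self):
        # Save a copy of the data and start a fresh audit trail.
        self._checkpoint = copy.deepcopy(self.data)
        self.audit_trail.clear()

    def recover(self):
        # Start from the checkpoint copy and replay the audit trail.
        restored = copy.deepcopy(self._checkpoint)
        for key, value in self.audit_trail:
            restored[key] = value
        self.data = restored
        return len(self.audit_trail)  # number of updates replayed


store = CheckpointedStore()
store.update("a", 1)
store.checkpoint()
store.update("b", 2)
store.update("a", 3)
store.data = {}               # simulate loss of the master copy
replayed = store.recover()
```

The number of updates replayed grows with the time since the last
checkpoint, which is why the recovery time depends on the
intercheckpoint interval.
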

Models for Cost

Cost  is  both  a  very  vague  and  ambiguous  measure  of  system 
performance  and  a  very  important  one.   The  ambiguity  comes  about  through 
the  difficulty  of  assigning  dollar  costs  to  all  factors  of  interest. 
One  way,  of  course,  is  to  carry  out  experiments  -  i.e.,  to  run  test 
programs  at  various  sites  and  compare  the  bills  received.   This  method 
yields  cost  comparisons  which  are  heavily  dependent  on  the  pricing 
policies of the various sites, as well as on site hardware and software.
Untangling all of these factors to determine what a set of cost figures
really  means  is  no  easy  task.   On  the  other  hand,  cost  is  very  important 
in  that  it  serves  as  an  overall  measure  of  system  resource  utilization. 
For  example,  by  assigning  costs  to  them,  such  diverse  factors  as  CPU 
time  and  storage  used  can  be  added  together.   In  short,  costs  are  a 
device  by  which  one  can  add  together  apples  and  oranges. 

Assignment  of  specific  costs  to  various  factors  is  of  importance 
to  the  model  user,  but  not  necessarily  to  the  model  builder.   The  latter 
can consider costs of various resources to be simply weighting
coefficients, which can be adjusted at will to reflect a specific environment.
It  may  be,  for  example,  that  no  real  money  changes  hands.   But  a  user 
may  still  wish  to  evaluate  a  certain  system  or  piece  of  software  by 
using  a  formula  which  weights  storage  (which  may  be  in  short  supply) 
much  more  heavily  than  CPU  time. 
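
The idea of costs as adjustable weighting coefficients can be shown
with a small sketch. The resource names, usage figures, and weights
below are invented for illustration only; the point is that the same
two alternatives can rank differently under different weightings.

```python
def weighted_cost(usage, weights):
    """Combine heterogeneous resource usage into a single figure."""
    return sum(weights[name] * amount for name, amount in usage.items())


# Two hypothetical ways of running the same job.
plan_a = {"cpu_seconds": 100, "storage_blocks": 50}
plan_b = {"cpu_seconds": 40, "storage_blocks": 200}

cpu_scarce = {"cpu_seconds": 10, "storage_blocks": 1}
storage_scarce = {"cpu_seconds": 1, "storage_blocks": 10}

# Under CPU-heavy weights plan B is cheaper; under storage-heavy
# weights plan A is cheaper.
cost_a_cpu = weighted_cost(plan_a, cpu_scarce)          # 1050
cost_b_cpu = weighted_cost(plan_b, cpu_scarce)          # 600
cost_a_sto = weighted_cost(plan_a, storage_scarce)      # 600
cost_b_sto = weighted_cost(plan_b, storage_scarce)      # 2040
```
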

In this brief review of cost models, we will be concerned only
with those which use "costs" to add together heterogeneous factors. In
our  search  of  the  literature  we  found  that  several  so-called  "cost" 
models  actually  dealt  only  with  time  factors.   Such  models  are  therefore 
discussed  elsewhere. 

Modeling  Network  File  Allocation.   Of  particular  relevance  to 
our  study  of  distributed  data  management  are  the  cost  analyses  developed 
for the network file allocation problem. A good example of such an
analysis is that given by Casey [1972]. The parameters in his model are

1.  the cost ("mainly for storage") of locating the file at any
site k,

2.  the costs of transmitting a given amount of data between two
given sites (with the possibility that update and query
transactions may be transmitted at different costs),

3.  the amount of update traffic emanating from each site, and

4.  the amount of query traffic emanating from each site.

Given  values  for  these  parameters,  the  cost  of  a  particular  allocation 
is  readily  computed. 
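
A minimal sketch of such a computation follows. It uses the four
parameter groups listed above with a linear cost model, and it assumes
(the text above does not spell this out) that each update is sent to
every copy while each query is sent to the cheapest copy. The function
name and the three-site figures are hypothetical.

```python
from itertools import combinations


def allocation_cost(copies, storage_cost, query_cost, update_cost,
                    query_traffic, update_traffic):
    """Cost of holding copies of the file at the sites in `copies`."""
    sites = range(len(storage_cost))
    storage = sum(storage_cost[k] for k in copies)
    # Assumption: every update must reach every copy.
    updates = sum(update_traffic[j] * update_cost[j][k]
                  for j in sites for k in copies)
    # Assumption: each query goes to the cheapest copy to reach.
    queries = sum(query_traffic[j] * min(query_cost[j][k] for k in copies)
                  for j in sites)
    return storage + updates + queries


# Invented three-site example; unit transmission costs are symmetric
# and the same for queries and updates.
storage_cost = [10, 8, 12]
unit_cost = [[0, 2, 3],
             [2, 0, 1],
             [3, 1, 0]]
query_traffic = [5, 3, 4]
update_traffic = [1, 2, 1]


def cost_of(copies):
    return allocation_cost(copies, storage_cost, unit_cost, unit_cost,
                           query_traffic, update_traffic)


# Exhaustive search over all non-empty sets of sites.
candidates = [c for n in (1, 2, 3) for c in combinations(range(3), n)]
best = min(candidates, key=cost_of)
```

Exhaustive search is practical only for a handful of sites; it is used
here simply to show that each candidate allocation is cheap to cost.
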

Casey  states  that  transmission  costs  may  be  "a  rather  complex 
monotonically  increasing  function"  of  traffic,  but  he  feels  that  his 
linear  model  is  a  good  first  approximation.   A  better  idea  of  transmission 
costs  would  require  a  model  which  goes  into  the  transmission  process  in 
some  detail  and  analyzes  the  various  cost  components  and  how  they  are 
affected  by  the  amount  of  network  traffic.   The  site  costs  might  also 
profit  from  a  detailed  breakdown;  note  that  Casey  remarks  that  factors 
other  than  storage  are  being  lumped  into  one  term.   It  is  important  to 
realize,  however,  that  for  file  allocation  Casey's  model  is  probably 
quite  adequate.   It  is  only  when  one  wishes  to  study  other  aspects  of 
data  distribution  -  backup  and  recovery  strategies,  say  -  that  more 
detail  is  needed. 

Modeling  storage  hierarchies.   Even  before  networks  existed, 
the  file  allocation  problem  was  of  importance.   The  question  arose  as  to 
where  one  should  place  a  given  file  in  a  storage  hierarchy  -  i.e.,  a  set 
of  memory  devices  of  varying  accessibility  (core,  disk,  tape,  etc.) 
connected  to  a  single  computer.   A  particularly  comprehensive  cost  model 
for this problem has recently appeared [Lum et al., 1975]. This model
differentiates  between  random  and  sequential  forms  of  data  access  and 
includes  considerations  of  staging,  channel  costs,  CPU  overhead,  etc. 
Because  of  its  completeness,  we  considered  this  model  an  appropriate  one 
for  extension  to  the  network  case.   That  is,  memory  devices  at  a  remote 
site may simply be considered as parts of the storage hierarchy, provided
that network costs are properly taken into account. A detailed discussion
of  the  model  of  Lum  et  al.  will  therefore  appear  below  in  the  discussion 
of  our  cost  model. 

The  distributed  data  management  problem  is  of  course  far  more 
complex  than  the  storage  hierarchy  problem.   The  model  of  Lum  et  al.  (and 
our  extension  of  it)  assumes  that  all  data  processing  (updating  and 
responding  to  queries)  takes  place  in  local  core.   No  provision  exists 
for  sending  a  query  to  a  remote  site  for  processing.   Thus,  although  our 
straightforward  extension  of  Lum's  storage  hierarchy  model  has  provided 
some  insight  into  data  distribution,  it  is  grossly  inadequate  for  studying 
all  the  many  facets  of  distributed  data  management.   Unfortunately,  the 
literature  contains  little  modeling  work  that  is  readily  applicable  to 
distributed  data  management.   Much  more  work  needs  to  be  done  to  develop 
models which realistically describe the distributed environment.

A Cost Model for Data Distribution

Introduction 

The  advantages  of  distributing  a  data  base  in  a  network 
environment  have  been  discussed  at  length  in  various  papers,  panel 
discussions,  and  bull  sessions.   But  it  has  been  somewhat  difficult  to 
quantify  these  advantages  or  to  investigate  the  various  tradeoffs  and 
determine  just  how  great  the  advantages  are.   In  this  section,  we  will 
attempt  to  shed  some  light  on  this  subject.   As  mentioned  above,  a 
recent paper by Lum et al. [1975] develops a cost algorithm for
allocating files in a storage hierarchy. Their cost model is rather
complete and lends itself well to extensions relevant to storage hierarchy
problems  in  distributed  data  base  systems. 

For  many  of  the  cost-related  questions  that  arise  in  the 
development  of  a  distributed  data  base  system  (such  as  those  concerned 
with  the  costs  of  queries,  updates,  back-up,  recovery,  etc.),  the  system 
can  at  first  be  viewed  as  a  storage  hierarchy.   That  is,  to  a  local 
process or user the remote sites appear as further levels of the
hierarchy. From this point of view the network is another channel with some
special  cost  considerations.   In  future  refinements  of  this  model,  we 
plan  to  include  effects  of  remote  processing  of  data.   We  were  unable 
to  do  so  in  this  short-term  effort. 

In  what  follows  we  will  first  review  the  model  described  in 
[Lum  et  al.,  1975].   (In  order  to  facilitate  the  discussion,  this  model 
will  be  referred  to  henceforth  as  the  LSWL  model.)   Next  we  will  extend 
the  LSWL  model  to  include  a  network.   Then  we  will  use  the  model  along 
with some relevant data to investigate some interesting questions, and
we will draw some conclusions about the advantages of distributed data
base systems. Finally, we list some ideas for refining the model.

A Review of the LSWL Model

Overview.   The  LSWL  model  primarily  addresses  the  problem  of 
"data  staging"  or  "data  migration".   In  other  words,  when  a  file  or  data 
set  is  not  being  used  (i.e.,  is  inactive)  it  is  stored  on  one  device 
(usually  a  slower,  less  expensive  one).   Then,  when  the  data  set  is 
accessed,  it  is  moved  to  a  faster,  more  expensive  device  so  that  the 
program  will  waste  fewer  resources  waiting  for  data.   The  question  we 
are  concerned  with  here  is,  given  the  accessing  characteristics  (number 
of  reads  and  writes,  proportion  of  time  the  file  is  in  use,  etc.),  where 
in  a  given  hierarchy  should  the  data  set  be  stored  when  it  is  inactive 
and  where  should  it  be  moved  when  it  is  active? 

The  authors  develop  an  objective  function  which  gives  the  cost 
of  accessing  a  data  set  which  is  stored  on  one  device  when  inactive  and 
another  (possibly  the  same  device)  when  active.   The  authors  assume  as  a 
first  approximation  that  the  entire  data  set  is  moved  from  the  inactive 
device  to  the  active  one. 

The selection algorithm is quite straightforward. The
objective function is evaluated for a given set of variables for each pair of
devices  in  the  hierarchy.   The  lowest  cost  then  indicates  on  which  pair 
of  devices  the  data  should  be  located. 

Assumptions. The authors make several simplifying assumptions,
most of which can be relaxed at the cost of a more complex cost function.
They assume that for data sets system paging activity will not
significantly affect cost. However, it would probably be necessary to relax
this constraint if one wished to consider costs incurred by program
activity. They further assume that transfers are direct rather than
through core and that there are no flow control problems (i.e., a fast
device can always accept data from a slow device). It is also assumed
that transfers are not constrained by the capacity of the device the
data set is being moved to. These last two assumptions can both be
dropped at the cost of a more complex equation. As we shall see, when
we add a network to the hierarchy, flow control cannot be ignored.

When  a  process  or  user  accesses  a  data  set,  it  often  must  wait 
for  the  access  to  complete.   Clearly,  this  wait  time  must  be  figured 
into  the  total  cost.   However,  multiprogramming  systems  take  advantage 
of  this  wait  time  by  letting  other  processes  utilize  the  processor.   To 
account  for  this  the  authors  define  an  adjusted  machine  cost,  m.   For 
lack  of  a  better  formulation,  they  have  defined  this  cost  to  be  the 
percent  of  CPU  idle  time  times  the  dollar  cost  associated  with  the  CPU. 
There  are  some  problems  with  such  a  definition.   For  example,  as  the 
load  on  the  system  increases  and  so  does  CPU  utilization,  queueing 
delays  and  system  overhead  also  increase,  thus  increasing  cost.   The 
objective  function  does  not  account  for  this  phenomenon. 

The objective function. Now that we have reviewed the
assumptions behind this analysis, let us look at the cost function itself in
some detail. The reader should consult Table 1 for a key to the
definition of the symbols used and Figure 1 for a summary of the objective
function.

Let us assume that the data set is at level i of the hierarchy
when inactive and level j when active. (For consistency we will adopt
the nomenclature used by Lum et al., whereby the first subscript denotes
the inactive device and the second the active one. Also, the higher
levels of the hierarchy, i.e., those with faster access, will have
higher indices.)

Data Set Characteristics:

\(q\) = number of sequential block accesses.
\(r\) = number of random block accesses.
\(S\) = data set size.
\(s\) = physical block size.
\(\tau_i\) = fraction of time data set is on level i.
\(d\) = number of times the data set is opened.
\(\lambda\) = the proportion of time to write the data set back to its
original position. For read-only data sets, \(\lambda = 0\); for
full write back at read speed, \(\lambda = 1\).

Storage Device Characteristics:

\(t_r^i\) = random access time for level i.
\(t_q^i\) = sequential access time for level i.
\(t_s^i\) = transmission rate for level i.
\(t_l^i\) = average revolution latency time for level i.
\(t_c^i\) = minimum access arm movement time.
\(n_i\) = unit cost of storage space at level i for the given time
period.
\(b_i\) = transfer size per access when data set is being moved from
a lower level i to another level (or from a higher level to
level i).
\(B_i\) = largest size that can be transferred without additional access
cost.

CPU and Channel Characteristics:

\(m\) = adjusted cost per unit time for computer system excluding
channel.
\(M\) = unadjusted computer system cost per unit time.
\(u\) = cost of channel per unit time.
\(\beta\) = number of buffers.
\(W\) = computer setup time for opening a data set.

Table 1
Parameters in the LSWL Model
(from [Lum et al., 1975])

The objective function can be considered to have three major terms:

\[
f_{ij} = \{\text{storage cost}\} + \{\text{local process access costs}\}
       + \{\text{staging transfer costs}\}
\]

The first term is the cost of storing the data on the active
and inactive devices.

\[
\{\text{storage cost}\} = [\tau_i n_i + \tau_j n_j]\,S
\]

When a data set is moved from level i to level j it is not necessarily
deleted from level i; therefore it should be noted that
\(\tau_i + \tau_j \ge 1\).

The second term is the cost for the user or process to access
the data from the active device. This term takes into account the CPU
costs and transfer overhead as well as channel costs for both random and
sequential accesses.

\[
\begin{aligned}
\{\text{CPU costs for sequential access}\} &= m q\,[(t_q^j/\beta) + (s/t_s^j)] \\
\{\text{CPU costs for random access}\} &= m r\,[t_r^j + (s/t_s^j)] \\
\{\text{random access channel costs}\} &= u r\,[t_l^j + (s/t_s^j)] \\
\{\text{sequential access channel costs}\} &= u q\,[(t_l^j/\beta) + (s/t_s^j)]
\end{aligned}
\]

The final term computes the cost of moving the data from level
i to level j and includes factors for writing the data back to level i
if necessary, preparation for transfer, latency waiting for the next
block, and block transmission costs.

\[
\{\text{cost to move data from level } i \text{ to level } j\}
 = (1 + \lambda)\,d\,\{M W + (S/b_i)[m t_l^i + (m b_i/t_s^i) + (u b_i/t_s^i)]
   + (m S/B_i)\,t_c^i\}\,\Gamma(i - j),
\]

where \(\Gamma(x)\) is 0 if x = 0 and is 1 otherwise.

\[
\begin{aligned}
f_{ij} ={}& \{\tau_i n_i + \tau_j n_j\}\,S
   && \text{storage cost} \\
&+ m q\,[(t_q^j/\beta) + (s/t_s^j)]
   && \text{CPU cost: sequential access + transmission} \\
&+ m r\,[t_r^j + (s/t_s^j)]
   && \text{CPU cost: random access + transmission} \\
&+ \{u q\,[(t_l^j/\beta) + (s/t_s^j)] + u r\,[t_l^j + (s/t_s^j)]\}
   && \text{channel cost} \\
&+ (1 + \lambda)\,d\,\{M W + (S/b_i)[m t_l^i + (m b_i/t_s^i) + (u b_i/t_s^i)]
   + (m S/B_i)\,t_c^i\}\,\Gamma(i - j)
   && \text{cost to move the data set between levels i and j}
\end{aligned}
\]

Figure 1
Objective Function for the LSWL Model


Network  Model 


Further  extensions  than  those  discussed  here  are  necessary  to 
model  the  cost  of  a  distributed  data  management  system  in  complete 
detail.   However,  the  model  developed  here  is  a  good  first  approximation 
and  will  allow  investigation  of  the  tradeoffs  between  storage  and  access 
economy.   It  will  also  provide  an  accurate  model  of  file  or  data  set 
staging  in  a  network. 

As mentioned earlier, a primary concern in extending the LSWL
model to allow for a network in the hierarchy is to account for the flow
control and other protocol-related costs that will be incurred. The
cost function used has the basic form:

\[
c_{ij} =
\begin{cases}
f_{ij}, & i > k \\
g_{ij}, & i \le k
\end{cases}
\qquad (j \text{ always greater than } k)
\]

where k is the first remote level of the hierarchy. (Here we are
tacitly assuming that all staging will be done to a local device.) We
have already discussed the original objective function, \(f_{ij}\). We will
now proceed to consider the cost function that deals with the network.
The reader is directed to Table 2 for a key to additional symbols and to
the summary of \(g_{ij}\) in Figure 2. The network cost function can be
characterized as:

\[
\begin{aligned}
g_{ij} ={}& \{\text{storage cost}\}
 + \{\text{cost to move from inactive remote level to highest remote level}\} \\
&+ \{\text{cost to move from highest remote level to the net}\}
 + \{\text{network cost}\} \\
&+ \{\text{cost to move from net to active level}\}
 + \{\text{process access costs}\}
\end{aligned}
\]

The major differences in this equation from the purely local
version are the added network costs and the distinction between local
and remote charging rates. Otherwise most of the terms are special
cases of the original and we will not discuss them in detail.

The network costs consist of two major components: the set-up
costs for using the network and the cost of the traffic sent on the
network.

\[
\begin{aligned}
\text{network costs} ={}& d e\,\{(m_r + m_L)\,t_{nd} + (M_r + M_L)\,t_{np}\}\{1 + \Gamma(\lambda)\} \\
&+ (1 + \lambda)(S K n_k/b_k)\,d \\
&+ 2 e n_k d\,\{1 + \Gamma(\lambda)\}
\end{aligned}
\]

The first term is the cost of setting up the transfers in
terms of the number of message exchanges required (protocol
negotiation), network delay and protocol processing. The other two terms are
network charges for the packets actually sent. The first of these is
the cost for the data sent and the second is for the messages sent for
the set-up negotiation. The constant K in the data term is a
"compression" factor to allow inclusion of data compression and protocol
overhead in data transmission (headers, restart markers, etc.). The
transmission cost of the network, \(n_k\), is calculated in terms of packets
sent, a charging structure in use in the commercial domain. (It should
be noted that the symbols with the subscript k do not refer to the
properties of the highest remote level of the hierarchy but to
properties of the network, such as transmission rate, packet size, etc.)
Factors involving \(\lambda\) are included in the network costs to take
account of the possibility of shipping the data back to inactive store.
Notice that a transfer must be set up no matter how small an amount is
sent back - hence the appearance of \(\Gamma(\lambda)\) in the formula.

\(e\) = number of message exchanges necessary to set up the transfer
\(t_{nd}\) = message round trip delay time in the network
\(t_{np}\) = CPU time for protocol overhead (on a per protocol message basis)
\(K\) = "compression" factor
\(t_{nr}\) = network CPU time to receive data
\(t_{nt}\) = network CPU time to transmit data
\(u_r\) = remote channel cost
\(u_L\) = local channel cost
\(m_r\) = adjusted remote system cost
\(m_L\) = adjusted local system cost
\(N\) = number of data set copies necessary to achieve a desirable
level of reliability
\(n_k\) = network transmission cost
\(M_r\) = unadjusted remote system cost
\(M_L\) = unadjusted local system cost
\(b_k\) = network packet size

Table 2
Supplementary Parameter List for Network Model

\[
\begin{aligned}
g_{ij} ={}& \{\tau_i n_i + \tau_j n_j\}\,S
   && (1)\ \text{storage cost} \\
&+ (1 + \lambda)\,d\,\{(S K/b_k)[m_r b_k/t_s^k + u_r b_k/t_s^k]\}
   && (2)\ \text{cost to move between highest remote level and net} \\
&+ (1 + \lambda)\,d\,\{M_L W + (S/b_i)[m_r t_l^i + (m_r b_i/t_s^i) + (u_r b_i/t_s^i)]
   + (m_r S/B_i)\,t_c^i\}
   && (3)\ \text{cost to move between inactive level i and highest remote level} \\
&+ d e\,\{(m_r + m_L)\,t_{nd} + (M_r + M_L)\,t_{np}\}\{1 + \Gamma(\lambda)\}
   && (4)\ \text{protocol set up cost} \\
&+ 2 e n_k d\,\{1 + \Gamma(\lambda)\}
   && (5)\ \text{network charges for protocol messages} \\
&+ (1 + \lambda)(S K n_k/b_k)\,d
   && (6)\ \text{data transfer network costs} \\
&+ (M_r t_{nt} + M_L t_{nr})(S/b_k)\,d + \lambda (M_r t_{nr} + M_L t_{nt})(S/b_k)\,d
   && (7)\ \text{network software cost to send data and receive it} \\
&+ m_L q\,[(t_q^j/\beta) + (s/t_s^j)] + m_L r\,[t_r^j + (s/t_s^j)]
   && (8)\ \text{CPU costs for random and sequential access and for retrieval from active location} \\
&+ u_L q\,[(t_l^j/\beta) + (s/t_s^j)] + u_L r\,[t_l^j + (s/t_s^j)]
   && (9)\ \text{channel costs for local retrieval} \\
&+ (1 + \lambda)\,d\,\{(S K/b_k)[(m_L b_k/t_s^k) + (u_L b_k/t_s^k)]\}
   && (10)\ \text{cost to move from net buffers to active device}
\end{aligned}
\]

Figure 2
Objective Function for the Network Model
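
The two set-up terms, (4) and (5) in Figure 2, are simple enough to
evaluate directly. The sketch below is one reading of those terms as
printed; the function names are hypothetical, and the numbers are the
ones quoted in the example that follows (d = 30, e = 15, a 200 ms round
trip, 1 ms of protocol CPU time per message, $10/hr adjusted and
$100/hr unadjusted system costs at both sites, and $1.25 per 1000
packets).

```python
def gamma(x):
    """Indicator used in the cost functions: 0 if x == 0, else 1."""
    return 0 if x == 0 else 1


def setup_costs(d, e, lam, m_r, m_L, M_r, M_L, t_nd, t_np, n_k):
    """Return (protocol set-up cost, packet charges for set-up messages)."""
    directions = 1 + gamma(lam)   # a second set-up is needed if data is sent back
    protocol = d * e * ((m_r + m_L) * t_nd + (M_r + M_L) * t_np) * directions
    packets = 2 * e * n_k * d * directions
    return protocol, packets


HOUR = 3600.0  # seconds; hourly rates are converted to dollars per second
protocol, packets = setup_costs(
    d=30, e=15, lam=1,
    m_r=10 / HOUR, m_L=10 / HOUR,
    M_r=100 / HOUR, M_L=100 / HOUR,
    t_nd=0.200, t_np=0.001,
    n_k=1.25 / 1000,
)
# Under these assumptions: protocol is about $1.05 and packets $2.25 per month.
read_only_protocol, read_only_packets = setup_costs(
    30, 15, 0, 10 / HOUR, 10 / HOUR, 100 / HOUR, 100 / HOUR,
    0.200, 0.001, 1.25 / 1000)
```

On this reading the set-up terms amount to a few dollars per month, so
they matter only when the amount of data shipped is small.
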

Example.   Consider  a  situation  in  which  there  is  a  four-level 
hierarchy  (core,  drum,  disk,  and  archive),  both  locally  and  at  a  remote 
site.   Assume  that  values  of  the  relevant  parameters  are  as  given  in 
Table  3  (taken  from  Lum  et  al.  [1975])  and  that  they  are  the  same  at 
both  sites. 


Parameter    Core        Drum        Disk         Archive      Units

t_r^i        10^-6       5 x 10^-3   60 x 10^-3   5            second
t_s^i        infinite    10^6        3 x 10^5     5 x 10^4     byte/sec
t_q^i        0           8 x 10^-3   13 x 10^-3   25 x 10^-3   second
t_l^i        0           8 x 10^-3   12 x 10^-3   20 x 10^-3   second
t_c^i        0           0           25 x 10^-3   40 x 10^-3   second
n_i          2 x 10^-2   5 x 10^-4   3 x 10^-5    3 x 10^-7    $/byte/month
b_i          *           20,000      7,000        2,000        byte
B_i          *           4 x 10^6    140,000      10,000       byte

*  Irrelevant

Table 3
Parameters for Storage Hierarchy

It does not, of course, make sense to consider inactive storage at
remote core, and this case is omitted. Let the number of local buffers
be two (\(\beta = 2\)) and assume that there is no setup time to open a
data set (W = 0). Suppose that a data set of \(10^8\) bytes is active for
one eight-hour shift per day, so that on a per-month basis d = 30 (i.e.,
the data set is opened once per day). Furthermore, the set is then
active 1/3 of the time (\(\tau_j = 1/3\)), and we shall assume that
\(\tau_i = 1\) (i.e., that the set is permanently resident at the inactive
location). Let the set be blocked into 1500-byte physical records
(s = 1500) and suppose that \(\lambda = 1\) (so that the data set is always
written back at the end of each day). Finally assume that there are
90,000 sequential accesses to the active copy per month and 210,000
random accesses (i.e., q = 90,000 and r = 210,000). These values all
correspond to those used by Lum et al. in their example.

Next, network parameters are needed. We have taken \(b_k = 125\)
bytes, the ARPANET packet size; \(t_{nd} = 200\) ms and
\(t_s^k = 5 \times 10^3\) bytes/sec, both ARPANET figures; \(t_{np} = 1\) ms,
which is roughly the time for an ARPA NCP to handle one protocol command
(including response); \(t_{nr} = 1\) ms, an average figure which runs from
about 0.5 ms NCP time to 2 ms if the process must be awakened; and
\(t_{nt} = 2\) ms, which consists of about 1 ms to get to the NCP and 0.5
to 1 ms to use it. (These estimates for \(t_{np}\), \(t_{nr}\), and
\(t_{nt}\) were supplied to us by G. Grossman of the Center for Advanced
Computation.) It should be noted that both \(t_{nr}\) and \(t_{nt}\) should
be slightly larger to allow for data processing by the file transfer
protocol. This is particularly true if data compression is being
carried out. But for this example we initially assume K = 1. Also,
\(t_{nt}\) and \(t_{nr}\) as given are times per message; we have divided by
8 to get a per-packet estimate, since a maximum of 8 packets per message
is allowed. The parameter e was set at 15. This is arrived at as follows.
In the ARPANET, it requires 7 exchanges to open an FTP connection, plus
from 4 to 7 commands to set parameters and 3 more to open the data
connection. It should be noted that by using ARPANET data and the
values supplied by Grossman we are essentially computing lower bounds on
network costs. In other environments the network costs will be higher
and results are likely to be quite different.

Finally, cost estimates are needed. For network transmission
we assumed \(n_k\) = $1.25 per 1000 packets, a quoted Telenet commercial
rate. To begin with we have assumed that \(m_L = m_r\) = $10/hr,
\(M_L = M_r\) = $100/hr, and \(u_L = u_r\) = $8/hr. Clearly under these
assumptions remote storage will not be cost effective; but by adjusting
the cost of the remote site relative to that locally, we should reach a
point where remote storage is cheaper. The values calculated for costs
\(c_{ij}\) (see Figures 1 and 2) are given in Table 4. As expected, remote
storage is not economical for the assumed cost structure. The cheapest
method is for the inactive data to be stored on local archive and
transferred to local disk when active.
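
The purely local entries of Table 4 can be recomputed from the
objective function of Figure 1 with the parameters of Table 3 and the
usage figures above. The sketch below is one such calculation, with
hourly rates converted to dollars per second. Two points are
assumptions rather than statements from the text: the channel terms use
the latency time, and a data set whose inactive and active devices
coincide is charged for storage once. All names in the code are
hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    t_r: float  # random access time, s
    t_s: float  # transmission rate, bytes/s
    t_q: float  # sequential access time, s
    t_l: float  # average revolution latency, s
    t_c: float  # minimum access arm movement time, s
    n: float    # storage cost, $/byte/month
    b: float    # transfer size per access when staging, bytes
    B: float    # largest transfer without additional access cost, bytes


# Table 3, fastest level first; b and B are irrelevant for core.
LEVELS = [
    Device("core", 1e-6, math.inf, 0.0, 0.0, 0.0, 2e-2, math.nan, math.nan),
    Device("drum", 5e-3, 1e6, 8e-3, 8e-3, 0.0, 5e-4, 20_000, 4e6),
    Device("disk", 60e-3, 3e5, 13e-3, 12e-3, 25e-3, 3e-5, 7_000, 140_000),
    Device("archive", 5.0, 5e4, 25e-3, 20e-3, 40e-3, 3e-7, 2_000, 10_000),
]

# Usage and cost figures of the basic example.
S, s, q, r, d = 1e8, 1500, 90_000, 210_000, 30
lam, beta, W = 1, 2, 0.0
tau_i, tau_j = 1.0, 1 / 3
m, M, u = 10 / 3600, 100 / 3600, 8 / 3600


def f(inactive, active):
    """Monthly cost in dollars for one (inactive, active) device pair."""
    x = s / active.t_s  # block transmission time; zero for core
    access = (m * q * (active.t_q / beta + x) + m * r * (active.t_r + x)
              + u * q * (active.t_l / beta + x) + u * r * (active.t_l + x))
    if inactive is active:
        # Assumption: no staging, and storage is charged once.
        return inactive.n * S + access
    storage = (tau_i * inactive.n + tau_j * active.n) * S
    per_move = (M * W
                + (S / inactive.b) * (m * inactive.t_l
                                      + (m + u) * inactive.b / inactive.t_s)
                + (m * S / inactive.B) * inactive.t_c)
    return storage + access + (1 + lam) * d * per_move


def sig3(x):
    """Round to three significant figures, as in the printed table."""
    return float(f"{x:.3g}")


# Thousands of dollars per month; inactive device at the same or a slower level.
costs = {(i.name, j.name): sig3(f(i, j) / 1000)
         for a, j in enumerate(LEVELS) for i in LEVELS[a:]}
best = min(costs, key=costs.get)   # (inactive, active) pair of least cost
```

Evaluating every pair and taking the minimum is the whole of the
selection algorithm; the remote entries would require the network
function of Figure 2 in place of f.
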


                                   Active Location (j)

Inactive Location (i)    Local Core   Local Drum   Local Disk   Local Archive

Local Core                  2000
Local Drum                   717         50.0
Local Disk                   670         19.8         3.05
Local Archive                668         17.5         1.91          3.01
Remote Drum                  724         73.7         58.1          60.0
Remote Disk                  677         26.9         11.3          13.2
Remote Archive               675         24.9         9.27          11.2

Table 4
Computed values of total costs \(c_{ij}\) for the basic example.
Entries are in thousands of dollars per month.

In an attempt to determine for what relative costs it becomes
cost-effective to store remotely, we recomputed the \(c_{ij}\)'s for a
decreasing sequence of values of \(M_r\), \(m_r\), and \(u_r\). All other
parameters were kept the same. Even when the cost ratio Z was

\[
Z = \frac{m_r}{m_L} = \frac{M_r}{M_L} = \frac{u_r}{u_L} = [\ldots]
\]

the best strategy was still to store on local archive and transfer to
local disk. At this point, however, the best remote strategy (remote
archive to local disk) was less than twice as expensive as local archive
to local disk (compared with a factor of more than 4 in the Table 4
example). Closer examination of the individual terms computed showed
that what keeps remote storage from becoming cost effective are fairly
large contributions from terms (2) and (6) (cost to move from highest
remote level to net and data transfer network costs, see Figure 2). In
short, shipping a data base of \(10^8\) bytes back and forth across a
network daily is just not likely to be cost effective under most
conditions!

If costs for shipping through the network are, as it appears,
making remote storage uneconomical, compression of the data before
shipment should help. We therefore inserted a compression factor
K = 0.1 (about as small as is realistic) into the model and recomputed
the \(c_{ij}\) for cost ratio Z = 0.1, and all other parameters the same as
for the Table 4 example. Remote storage now becomes cost effective - the
best strategy is to transfer from remote archive to local disk. (See
Table 5.) The reader should keep in mind throughout this discussion
that the numbers and comparisons given here should not be taken too
literally. The simplistic hierarchical storage model we are using does
not take into account, for example, cost advantages which may occur due
to remote data processing.

                                   Active Location (j)

Inactive Location (i)    Local Core   Local Drum   Local Disk   Local Archive

Local Core                  2000
Local Drum                   717         50.0
Local Disk                   670         19.8         3.05
Local Archive                668         17.5         1.91          3.01
Remote Drum                  717         67.1         51.4          53.4
Remote Disk                  670         20.1         4.46          6.39
Remote Archive               667         17.2         1.59          3.52

Table 5
Computed values of \(c_{ij}\) for K = 0.1, cost ratio Z = 0.1.
Entries are in thousands of dollars per month.


Starting at this point, we increased Z (since a ten-to-one cost
ratio is probably unrealistic) to see up to what value of Z remote
storage remains cost effective (for K = 0.1). Throughout the range of
Z values, the best local strategy is archive to disk; the best remote
strategy is remote archive to disk. We have graphed the costs of these,
versus Z, in Figure 3. The local strategy cost is, of course, independent
of K or of remote costs (and hence of changes in Z). Notice that the
crossover occurs at Z = 0.3 - a value which might well occur in practice.
Another interesting feature is the linearity of the curve, which may
make the model more useful as input into decision algorithms.

An unexpected result was that decreasing S (the data base size)
to 10^7 bytes and then to 10^6 bytes led to virtually no change in
this crossover value of Z. Even at 10^6 bytes the local and remote best
strategies are nearly of equal cost at Z = 0.3. But for this small a
data base the best active storage becomes local drum instead of local
disk. For larger data bases (10^9 bytes or more), however, the main
costs are those for storage, and the minimum in the matrix corresponds
to permanent storage in local archive (no staging).

Though the crossover point is insensitive to S, it is clearly
sensitive to K. To investigate this feature further, we generated
the second curve in Figure 3, for which the only difference in
parameters is that K = 0.2. As expected, the crossover point has
decreased, to about Z = 0.17. To a good approximation, as we decrease
the amount of compression (that is, as K increases), the remote costs
must decrease proportionately for remote storage to remain cost
effective. (A quick check showed the trend holding for K = 0.5. In
this case remote storage is almost - but not quite - cost effective
for Z = 0.1.)
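
The proportionality noted above can be checked roughly against the two crossover values quoted in the text. The following is only a sketch: it assumes that the product K * Z is approximately constant at the crossover, which is an interpretation of the "proportionately" remark and not a formula from the model, and the name `estimated_crossover` is hypothetical.

```python
# Crossover points quoted in the text: K -> Z at which remote storage breaks even.
observed = {0.1: 0.30, 0.2: 0.17}

# Assumption (not from the model itself): K * Z is roughly constant at crossover.
products = [k * z for k, z in observed.items()]
kz_const = sum(products) / len(products)


def estimated_crossover(k):
    """Estimated break-even cost ratio Z for compression factor k (hypothetical rule)."""
    return kz_const / k


z_half = estimated_crossover(0.5)
print("K*Z at the observed crossovers:", [round(p, 3) for p in products])
print(f"Estimated crossover for K = 0.5: Z = {z_half:.3f}")
```

Under this assumption the estimated crossover for K = 0.5 falls somewhat below Z = 0.1, which agrees with the remark that remote storage is almost, but not quite, cost effective there.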

In conclusion, we have seen that remote storage of even very
large data bases may be economical, provided the data is shipped
compressed and there is a sufficient differential between local and
remote costs. However, it is perhaps not reasonable to assume that the
whole data base is transferred.

A  more  realistic  model  would  allow  for  transferring  only  a 
portion  of  the  data.   One  approach  would  be  to  transfer  a  block  of  data 
only  when  needed.   Suppose  it  is  assumed  that  each  access  request 
initiates  a  transfer  of  the  relevant  block  or  blocks.   This  supposition, 
however,  contradicts  the  whole  basis  of  the  present  staging  model  - 
namely,  that  the  data  base  is  transferred  from  inactive  to  active 
storage  and  then  accessed  on  a  block  by  block  basis.   A  compromise  can 
be  achieved  by  assuming  that  only  a  portion  of  the  entire  data  base  is 
staged daily, as discussed below.
Revised Model: Partial Staging

In this section, we will consider the effects of altering
the  model  to  take  account  of  the  possibility  that  only  a  part  of  the 
data  base  is  transferred  from  inactive  to  active  storage.   We  introduce 
a  new  parameter: 

S'  =  size  of  data  set  transferred. 
Then  the  following  changes  are  to  be  made  in  the  equations: 

In  Figure  1,  the  storage  cost  becomes  x.n.S  +  T.n.S',  and  in 
the last term S is replaced by S'.

In  Figure  2,  the  storage  cost  (term  (1))  is  changed  just  as  in 
Figure  1.   In  addition,  all  other  occurrences  of  S  are  changed  to  S'. 

Figure 4 shows the results of some computations for partial
staging.   The  parameters  chosen  were  such  that  results  are  directly 
comparable  to  the  K  =  0.2  curve  in  Figure  3.   The  absolute  costs  have 
of  course  decreased  considerably.   However,  the  interesting  feature 
to  notice  is  that  the  crossover  point  is  virtually  unchanged  from 
when  the  entire  data  base  is  transferred.   This  seems  to  be  another 
aspect  of  the  relative  insensitivity  of  cost  effectiveness  to  changes 
in  data  base  size. 
Application  of  the  Model  to  Multi-site  Usage 

There  is  another  type  of  strategy  question  that  may  readily 
be  studied  by  using  the  model.   Suppose  users  at  two  sites  wish  to  use 
the  data  base,  but  it  will  be  used  more  heavily  at  one  (Site  A)  than 
at  the  other  (Site  B) .   Should  Site  B  use  Site  A's  copy  or  store  its 
own  locally?   In  this  section  we  give  an  example  of  this  type  of  problem 
and  its  solution. 


[Figure 4. Comparative costs (in dollars per month) versus Z, for
S = 10^8 and K = 0.2. Curves show the best local strategies and the
best remote strategies, for S' = 10^7 and one other value of S'.]

Suppose that the data base is 10^8 bytes in size, and suppose
that  costs  and  other  parameters  (except  those  describing  usage)  are 
the  same  for  both  sites  and  are  those  assumed  in  the  computation  of 
the  Table  4  entries.   Let  the  usage  at  Site  A  also  be  the  same  as  was 
assumed  in  computing  Table  4.   Then  we  know  that  the  best  strategy  from 
Site  A's  point  of  view  is  to  store  the  data  on  local  archive  and  stage 
it  to  local  disk.   We  therefore  assume  that  this  is  done,  at  a  cost 
of  $1,914  per  month  (from  the  computation  for  Table  4). 

Now suppose that Site B only uses 10 percent of the data
base (perhaps a different 10 percent on different days) and that Site B
performs far fewer accesses, say again by a factor of 10. We now rerun
the model to obtain the c_ij matrix from Site B's point of view. The
parameter changes to do this are: S = 10^7, q = 9000, r = 21,000, and
A = 0. (This last change is made because we assume that Site A makes
all the changes in the data; Site B just retrieves it.) The resulting
matrix of c_ij values is shown in Table 6. (We have omitted the "local
core" column here because the storage options involving core are too
expensive to be interesting.)


                        Active Location (j)

Inactive           Local      Local      Local
Location (i)       Drum       Disk       Archive

Local Drum         5001
Local Disk         1974        305
Local Archive      1712        150        301
Remote Drum        7021       5458       5652
Remote Disk        2329        767        960
Remote Archive     2081        518        712

Table 6

Matrix of costs c_ij for Site B. (See text.)
Entries are in dollars per month.

From the table, we see that Site B's best local strategy is to store on
archive and stage to disk, at a cost of $150 per month. Furthermore we
see that the best strategy involving A's archive as inactive storage is
to stage the data to disk, at a cost of $518. This includes, however, a
cost of $3 per month to store 10^7 bytes in A's archive, and this
storage cost is already assumed to be paid for by A. Thus the net cost
to B is $515 per month. Finally we make the comparison:

Total  cost,  storage  at  both  A  and  B  =  $2064  per  month. 

Total  cost,  storage  at  A  only       =  $2429  per  month. 
Not surprisingly (since the cost of storage itself is so small) the
first option is the cheaper. However, the increased cost of the second
option is only $365, or about 18 percent. This may be very much
worthwhile in view of the problems that arise in trying to keep more
than one copy up to date. Furthermore, this computation was carried out
with K = 1 (no data compression). If the data is transferred in
compressed form, say K = .25, Site B's best local strategy is as before.
However, the best strategy involving A's archive as inactive storage is
to stage the data to disk, at a cost of $258 (subtracting off the
duplicated storage costs as before). Thus the total cost of the second
option is $2172 per month, which is an increase of only $108 or
5 percent.
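
The K = 1 comparison above can be reproduced directly from the Table 6 entries. The sketch below is illustrative only: the dictionary layout and the helper name `cheapest` are assumptions introduced here, while the dollar figures are those given in the text and in Table 6.

```python
SITE_A_COST = 1914        # Site A: local archive staged to local disk (Table 4)
DUPLICATED_STORAGE = 3    # storing Site B's data in A's archive, already paid by A

# Table 6 entries in dollars per month, keyed by (inactive location, active location).
TABLE_6 = {
    ("Local Drum", "Local Drum"): 5001,
    ("Local Disk", "Local Drum"): 1974,
    ("Local Disk", "Local Disk"): 305,
    ("Local Archive", "Local Drum"): 1712,
    ("Local Archive", "Local Disk"): 150,
    ("Local Archive", "Local Archive"): 301,
    ("Remote Drum", "Local Drum"): 7021,
    ("Remote Drum", "Local Disk"): 5458,
    ("Remote Drum", "Local Archive"): 5652,
    ("Remote Disk", "Local Drum"): 2329,
    ("Remote Disk", "Local Disk"): 767,
    ("Remote Disk", "Local Archive"): 960,
    ("Remote Archive", "Local Drum"): 2081,
    ("Remote Archive", "Local Disk"): 518,
    ("Remote Archive", "Local Archive"): 712,
}


def cheapest(inactive_prefix):
    """Return (strategy, cost) of the cheapest entry whose inactive location matches."""
    candidates = {
        key: cost for key, cost in TABLE_6.items() if key[0].startswith(inactive_prefix)
    }
    best = min(candidates, key=candidates.get)
    return best, candidates[best]


local_strategy, local_cost = cheapest("Local")
remote_strategy, remote_cost = cheapest("Remote Archive")

both_sites = SITE_A_COST + local_cost
site_a_only = SITE_A_COST + remote_cost - DUPLICATED_STORAGE
extra = site_a_only - both_sites

print("Storage at both A and B:", both_sites)
print("Storage at A only:      ", site_a_only)
print(f"Extra cost of A only:    {extra} ({100 * extra / both_sites:.1f} percent)")
```

Only the remote archive rows are searched for the second option, since the text considers strategies that use A's archive as inactive storage.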
Plans  for  Further  Work 

Clearly,  much  more  can  be  learned  by  experimentation  with  the 
present  model.   By  using  parameters  that  describe  specific  systems  and 
their  costs,  we  should  be  able  to  develop  cost  comparisons  for  important 
real  applications.   In  addition,  we  have  looked  carefully  at  the  effects 
of  varying  only  a  few  of  the  many  parameters  in  the  model.   By  varying 
others,  we  should  gain  further  insights  into  costs. 




We  might  also  investigate  other  approaches  to  deciding  on  a 
"best"  storage  policy  which  might  be  relevant  in  some  situations.   For 
example,  since  protocol  implementations  reside  as  user-level  processes 
in  many  operating  systems,  and  since  it  is  often  useful  to  consider  the 
data  set  as  being  staged  in  the  remote  system,  it  might  be  interesting 
to  consider  an  alternative  approach  which  runs  as  follows.   The  data  set 
allocations  on  the  remote  site  are  determined  according  to  the  LSWL 
model,  and  the  lowest-cost  strategy  is  selected.   The  cost  of  this 
strategy  plus  the  relevant  network  costs  are  then  used  to  form  the 
lowest  level  of  the  local  hierarchy,  where  the  cost  for  the  local  levels 
is  computed  using  the  LSWL  model  and  the  last  level  (the  remote  one) 
uses  a  slightly  modified  form.   Further  study  is  needed  to  determine 
whether  this  approach  will  yield  useful  data  for  decision  making. 

There  are  a  number  of  refinements  that  could  be  added  to  the 
model.   We  list  a  few  of  these  here. 

1)  There  could  be  a  provision  for  allowing  some  fraction  of  the 
queries  to  be  answered  locally,  while  the  rest  require  remote 
access.   (This  feature  may  be  useful  in  analyzing  the  cost 
effectiveness  of  intelligent  terminals  or  network  front-ends.) 

2)  The effects of the finite size of the storage devices might be
included.

3)  As  mentioned  earlier,  the  definition  of  the  adjusted  system 
cost  does  not  appear  to  reflect  the  effects  of  increased  load 
on  the  system.   This  point  requires  more  investigation  to  gain 
a  better  understanding  of  this  parameter  and  of  how,  if 
necessary,  system  loads  may  be  inserted  into  the  model. 

4)  The model developed by Lum et al. was intended to represent
file migration or data staging. Thus, when a data set is
written back to the inactive device, the operation is considered
to be symmetrical to the original read. If this model
is to be an accurate characterization of a data management
system, it will be necessary to include the cost of performing
updates.

5)  Since data base reliability appears to be one of the major
advantages  of  distributing,  it  is  very  important  that  the 
model  be  capable  of  evaluating  the  cost  of  various  multi-copy 
backup  schemes  with  respect  to  the  level  of  reliability  they 
provide.   We  have  therefore  provided  a  parameter,  N,  to 
indicate  how  many  copies  exist  in  the  network.   Unfortunately, 
we  have  not  yet  determined  how  this  parameter  should  be 
inserted into the model.
A Model for Distributed Data Availability

Introduction 

In  this  section  we  attempt  to  quantify  the  improvement  in  data 
base  availability  which  can  be  achieved  by  storing  a  backup  copy  at  one 
(or  more)  remote  sites  in  a  network.   We  also  discuss  the  practicality 
of  certain  alternative  management  strategies. 

Availability  is  defined  as  the  probability  that  at  least  one 
copy  of  the  data  base  is  up  and  usable  as  a  master  copy  for  queries  and 
updates.   Alternatively,  availability  can  be  thought  of  as  the  fraction 
of  time  that  the  data  base  is  expected  to  be  available  for  use. 

To  simplify  the  analysis,  we  will  not  consider  various  possible 
causes  of  data  base  failure,  but  will  assume  that  the  data  is  available 
when  the  host  computer  is.   Furthermore,  we  will  not  take  into  account 
scheduled  down  time  of  the  host  computer,  on  the  assumption  that  if  down 
time  is  scheduled,  transfer  to  a  backup  copy  is  automatic  and  immediate, 
and  leads  to  no  loss  in  availability.  (The  very  existence  of  a  backup 
copy  at  an  alternate  network  site  will  of  course  improve  availability 
considerably  over  the  case  where  only  one  site  has  a  copy.) 
The  Model 

Parameters. The parameters in the model are as follows:

F = mean time between computer failures, assumed to be the same
    for all host computers.
X = expected time to repair computer.
L = expected time to load the data base copy at the remote site.
Y = time that the audit trail of updates has been growing (i.e.,
    time since the copy was correct).
k = the ratio of update arrival rate to update processing rate,
    so that kY = time to process the audit trail.*
D = time delay between when the master fails and when the remote
    site determines this fact and starts to get its copy ready
    for use.

The equations. First, consider the case where there is a
single copy of the data base. The availability of this copy is then

$$A_0 = \frac{F}{F + X + kX}$$

This is the usual formula for availability (mean time between failures
divided by mean time between failures plus mean time to recover), where
the mean time to recover includes repair time X plus the time kX to
process the updates accumulated while repairs were made. (This formula
for recovery time is that used by Chandy et al. [1975].) There is a
question as to whether the term kX should be included here, since the
site is technically "up" after time X. But in a network setting, it
does seem appropriate to assume that updates initiated at remote sites
are being logged somewhere, so that there does exist an update list to
be processed. In addition, we are interested primarily in comparing A_0
with availabilities computed for multi-copy strategies, where the copies
are assumed to be up to date.

*  The parameter k is referred to in the literature as a "compression"
factor [Chandy et al., 1975]. This is not to be confused with the
usual data compression factor denoted by K in the previous section.

Consider Strategy 1 for transferring usage back and forth between
master copy and backup copy. After the master copy is determined to
have failed, the remote copy is then brought up (after a time lapse of
D + L + kY) and usage is transferred to it. Meanwhile the old master is
being repaired. Queries and updates are sent to the new master, however,
until it fails, at which time the process repeats: another "new"
master is identified and activated. (This may or may not be the "old"
master.) This strategy is diagrammed (but not to scale) in Figure 5.
Since the remote site may have been up for some time since its last
failure, it is assumed that, after the data base comes up, the expected
time until failure is only F/2. (Actually a smaller number may be more
reasonable, since some host time has already been spent in the recovery
operation.) Notice that an obvious built-in assumption can be read from
the figure:

$$D + L + kY < X + kX \tag{1}$$

If this inequality is not satisfied, it theoretically does not pay to
store a remote copy, since the master is expected to be repaired and
updated before the remote copy can be activated. The formula for
availability under Strategy 1 can then be read off Figure 5 as

$$A_1 = \frac{F}{2D + 2L + 2kY + F}$$

Strategy 2 is to immediately replace the copy by the old
master as soon as the latter has been brought back up. This scheme is
diagrammed in Figure 6. Again, inequality (1) must hold in order for
the diagram to be meaningful. There is an additional assumption which
must be made in order for our model of either strategy to be valid.
This assumption is that D + L + kY is sufficiently small compared to F
that there is little likelihood of a failure of the remote host during
the recovery process. In addition, Strategy 2 requires that

$$X + kX < \frac{F}{2} \tag{2}$$

If this is violated, there is a good probability that the copy may fail
before the master is ready. For reasonable values of F, however,
inequality (2) is readily satisfied; it is inequality (1) that must be
carefully checked in using the model. Finally, the availability formula
can be read from the diagram:

$$A_2 = 1 - \frac{D + L + kY}{F + X + kX}$$

Keeping inequality (1) in mind and comparing formulas, we see that
Strategy 1 is generally poorer than Strategy 2, and indeed A_1 is often
less than A_0. We will therefore restrict consideration to Strategy 2.
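
The three availability formulas and the two validity conditions can be sketched as small functions. This is a minimal illustration, not the authors' program: the function names are hypothetical, and the parameter values (in hours) are chosen here only as an example.

```python
def availability_single(F, X, k):
    """A_0: a single copy, with recovery time X + kX."""
    return F / (F + X + k * X)


def availability_strategy1(F, D, L, k, Y):
    """A_1: the backup becomes the new master and runs until it fails (expected F/2)."""
    return F / (2 * D + 2 * L + 2 * k * Y + F)


def availability_strategy2(F, X, D, L, k, Y):
    """A_2: the old master takes over again as soon as it is repaired and updated."""
    return 1 - (D + L + k * Y) / (F + X + k * X)


def inequality_1_holds(X, D, L, k, Y):
    """Inequality (1): the backup can be activated before the master recovers."""
    return D + L + k * Y < X + k * X


def inequality_2_holds(F, X, k):
    """Inequality (2): the master recovers within the backup's expected uptime F/2."""
    return X + k * X < F / 2


# Illustrative values in hours (assumptions made for this example only).
F, X, D, L, k, Y = 20.0, 1.0, 0.01, 0.5, 0.01, 1.0

a0 = availability_single(F, X, k)
a1 = availability_strategy1(F, D, L, k, Y)
a2 = availability_strategy2(F, X, D, L, k, Y)

print("inequality (1):", inequality_1_holds(X, D, L, k, Y))
print("inequality (2):", inequality_2_holds(F, X, k))
print(f"A_0 = {a0:.4f}, A_1 = {a1:.4f}, A_2 = {a2:.4f}")
```

For these values A_1 comes out slightly below A_0, and A_2 is the largest of the three, in line with the comparison made in the text.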

Sensitivity to parameter values. In any model, it is useful
to determine how sensitive the output values are to changes in the
inputs. Obviously, the inputs are only known approximately or are
statistical averages. If the output changes drastically for a small
change in an input value, the model is rather useless for predictive or
decision purposes. Chandy et al. [1975] use the elasticity E(f,y),
essentially the "percentage change in f caused by a percentage change in
y", to investigate the sensitivity of a function f with respect to a
parameter y. Formally, E is defined by

$$E(f, y) = \frac{\partial f}{\partial y} \cdot \frac{y}{f}$$

We have investigated the elasticity of U = 1 - A_2 with respect
to all of the input variables. (Working with U instead of A_2 simplifies
the algebra without changing the conclusion.) We find that for all
parameters

$$\left| \frac{\partial U}{\partial y} \cdot \frac{y}{U} \right| < 1.$$

For example, taking y = k,

$$\frac{\partial U}{\partial k} = \frac{FY + XY - DX - LX}{(F + X + kX)^2}$$

and

$$\left| \frac{\partial U}{\partial k} \cdot \frac{k}{U} \right| = \left| \frac{k(FY + XY - DX - LX)}{(F + X + kX)(D + L + kY)} \right| = \left| \frac{kFY + kXY - \cdots}{kFY + kXY + \cdots} \right| < 1.$$

And for y = Y,

$$\left| \frac{\partial U}{\partial Y} \cdot \frac{Y}{U} \right| = \frac{kY}{F + X + kX} \cdot \frac{F + X + kX}{D + L + kY} = \frac{kY}{D + L + kY} < 1.$$

Similar computations show that the elasticities of U with respect to D,
L, X, and F are all less than one. Elasticities of U are connected to
those of A_2 through

$$\left| \frac{\partial A_2}{\partial y} \cdot \frac{y}{A_2} \right| = \left| \frac{\partial U}{\partial y} \cdot \frac{y}{A_2} \right| < \left| \frac{\partial U}{\partial y} \cdot \frac{y}{U} \right|$$

as long as A_2 > U. We may conclude therefore that our model is stable,
being relatively insensitive to small changes in parameter values.
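
The elasticity bounds can also be checked numerically. The sketch below estimates each elasticity of U by a central finite difference; the parameter values (in hours), the step size, and the function names are assumptions made for illustration and do not come from the report.

```python
def unavailability(p):
    """U = 1 - A_2 = (D + L + kY) / (F + X + kX), for a dict of parameter values."""
    return (p["D"] + p["L"] + p["k"] * p["Y"]) / (p["F"] + p["X"] + p["k"] * p["X"])


def elasticity(f, p, name, rel_step=1e-6):
    """Central-difference estimate of (df/dy) * (y / f) for the parameter `name`."""
    h = p[name] * rel_step
    upper = dict(p, **{name: p[name] + h})
    lower = dict(p, **{name: p[name] - h})
    derivative = (f(upper) - f(lower)) / (2 * h)
    return derivative * p[name] / f(p)


# Illustrative parameter values in hours (assumed for this check only).
params = {"F": 20.0, "X": 1.0, "L": 0.5, "D": 0.01, "k": 0.01, "Y": 1.0}

elasticities = {name: elasticity(unavailability, params, name) for name in params}
for name, value in elasticities.items():
    print(f"E(U, {name}) = {value:+.4f}")
```

For y = Y the estimate should agree with the closed form kY / (D + L + kY) derived above, which gives a quick consistency check on the algebra.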
Experiments  and  Discussion 

Remote  journaling.   In  order  to  model  a  remote  journaling 
process,  we  assume  that  the  parameter  Y  is  large;  for  simplicity  we 
assume  that  it  is  equal  to  F.   Thus  we  are  essentially  assuming  that, 
whenever  the  master  comes  up  after  a  failure,  a  copy  of  the  up-to-date 
data  base  is  shipped  off  to  any  remote  site  which  contains  a  copy  of  the 
data  base.   (Or  that  the  remote  data  base,  having  been  used  as  a  master 
copy  while  the  master  was  down,  already  possesses  an  up-to-date  copy  at 
this  time.) 

It is interesting to note that journaling remotely by shipping
the data base over the network is not feasible on a regular basis. For
example, consider a data base of 4 x 10^7 bytes (roughly FORSTAT size).
At a network throughput of 15 kilobits per second (faster than normal
for the ARPANET), it would take approximately 6 hours to ship a data
base of this size. Daily backup by, say, sending tapes by courier
would, however, be feasible in many situations.
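
The six-hour figure follows from simple arithmetic, sketched below. The calculation assumes 8-bit bytes, 1 kilobit = 1000 bits, and no protocol overhead; these are assumptions made here and are not stated in the text.

```python
def transfer_hours(size_bytes, throughput_bits_per_second):
    """Time in hours to ship size_bytes at a sustained throughput, ignoring overhead."""
    seconds = size_bytes * 8 / throughput_bits_per_second
    return seconds / 3600


hours = transfer_hours(4e7, 15_000)
print(f"Shipping 4 x 10^7 bytes at 15 kilobits per second: {hours:.2f} hours")
```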

The data copy at the remote site will be generally assumed to
be on tape. The value L = 0.5 hr. has been assumed in the computations
since it is approximately the time to read two to three tapes. The
parameter D is probably on the order of one or two seconds, but we have
taken it to be .01 hr. as an absolute upper bound. X = 1 hr. seems to
be a reasonable mean value for repair time. With these parameters, we
get the following formula for improvement I in availability as a
function of F and k.

$$I = \frac{A_2 - A_0}{A_0} = \frac{0.49 + k(1 - F)}{F}$$

It  is  difficult  to  estimate  what  a  reasonable  value  of  k  should  be.   In 
a  similar  analysis,  Chandy  et  al.  [1975]  take  k  =  1/8.   Clearly  the 
value  will  depend  on  the  usage  pattern  for  the  data  base;  doubtless  ways 
of  measuring  it  for  a  real  system  could  be  devised.   However,  notice 
that,  with  k  =  1/8,  inequality  (1)  states  that 

.51  +  F/8  <  (1  +  1/8). 
Hence  for  this  large  a  k  the  time  to  process  the  audit  trail  is  so  long 
that  the  master  is  able  to  get  up  before  the  backup  copy  whenever  F  >  4.92 
hrs.,  which  is  an  unreasonably  low  value. 
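
The bound on F implied by inequality (1) when Y = F can be computed for any k. The following sketch solves D + L + kF < X + kX for F; the function name is hypothetical, and the defaults are the parameter values assumed in the text (X = 1, D = .01, L = 0.5, in hours).

```python
def max_valid_F(k, X=1.0, D=0.01, L=0.5):
    """Largest F (with Y = F) for which inequality (1), D + L + kF < X + kX, can hold."""
    return (X + k * X - D - L) / k


for k in (1 / 8, 0.05, 0.01):
    print(f"k = {k:.3f}: model valid only for F < {max_valid_F(k):.2f} hours")
```

The bound grows quickly as k shrinks, which is why a much smaller k is needed before the remote journaling model covers realistic values of F.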

To get a feel for the value of remote journaling, we therefore
take k = .01; i.e., we assume that there are few updates. In this case
inequality (1) restricts the model to F < 50. A graph of I vs. F in
this case may be seen in Figure 7. Notice that for reasonable values of
F the improvement in availability is less than 5 percent, which may not
be enough to make remote journaling worthwhile. Values of A_0 have also
been plotted in the figure for reference.

Figure 7

Single-site availability A_0 and fractional improvement I
through use of Strategy 2. Parameters are k = 0.01,
D = .01 hr., X = 1 hr., L = 0.5 hr., and Y = F.

Figure 8

Same as Figure 7, except that Y = 1 hr.

As a final comment on the remote journaling strategy described
here, we note that availability may actually decrease as F increases.
For example, suppose X = 2, k = 0.25, L = 0.5 and D = 0. Then
A_2 = .7692 for F = 4 and A_2 = .7647 when F = 6. Differentiating A_2
(for Y = F) with respect to F shows that this decrease will occur
whenever

k(k + 1)X > D + L.
Intuitively,  this  phenomenon  occurs  because  for  large  k  the  effect  of 
the  lengthening  audit  trail  to  be  processed  outweighs  that  of  the 
increasing  reliability  of  the  host  computer. 
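
The numerical example above is easy to reproduce. The sketch below evaluates the Strategy 2 availability with Y = F at the quoted parameter values and tests the stated condition; the function name is hypothetical.

```python
def a2_remote_journaling(F, X, k, L, D):
    """Strategy 2 availability when the audit trail is as long as the uptime (Y = F)."""
    return 1 - (D + L + k * F) / (F + X + k * X)


X, k, L, D = 2.0, 0.25, 0.5, 0.0

a2_at_4 = a2_remote_journaling(4.0, X, k, L, D)
a2_at_6 = a2_remote_journaling(6.0, X, k, L, D)
decreasing = k * (k + 1) * X > D + L

print(f"F = 4: A_2 = {a2_at_4:.4f}")
print(f"F = 6: A_2 = {a2_at_6:.4f}")
print("k(k + 1)X > D + L:", decreasing)
```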

Frequently  updated  remote  journal.   The  lack  of  effectiveness 
of  the  remote  journaling  strategy  described  in  the  last  section  seemed 
to  be  caused  by  the  necessity  of  processing  an  extremely  long  audit 
trail.   Suppose,  then,  that  we  drop  the  assumption  that  Y  =  F  and  assume 
instead  that  the  remote  copy  is  periodically  brought  up  to  date.   As  an 
example,  we  might  assume  this  updating  to  take  place  every  two  hours,  so 
that  the  average  length  of  audit  trail  to  process  to  bring  up  the  remote 
copy  is  1  hour.   With  all  other  parameters  as  specified  for  Figure  7, 
but  with  Y  =  1, 

I  =  .49/F. 
This result, which is graphed in Figure 8, is independent of k (because
of the cancelling of kX and kY terms), as long as k and F are such that
the model is valid. Unfortunately, the improvement is still generally
less than 5 percent.

Indeed, the curves in Figures 7 and 8 are nearly identical.
To see why this should be so, consider more closely the formula for I.

$$I = \frac{A_2 - A_0}{A_0} = \frac{X + kX - D - L - kY}{F}$$

As long as k is small (or when X = Y as above) it is clear that the
terms kX and kY contribute little (or cancel), so that I is
approximately (X - D - L)/F.

Running spares. Here we assume that the backup copy is stored
on disk for virtually instantaneous access and is kept almost up to
date. Reasonable parameters for this case might be L = 0, Y = .1 hr.,
and (for comparison with the results above) X = 1, k = .01. Then we
have

$$I = \frac{0.999}{F}$$

We will not bother to graph this; this curve looks just like the earlier
ones, only the values of I are approximately doubled. In this case,
improvements of 5 to 10 percent are seen for F between 10 and 20 -
certainly enough to make the strategy worthwhile. In fact, what happens
in this case is that, under our assumptions, availabilities are brought
up to very nearly unity. To see this, note that

$$A_2 = 1 - \frac{.01 + kY}{F + (1 + k)X}$$

and for our example kY = 0.001. Increasing k will cause somewhat smaller
values of A_2, but A_2 will be over 99 percent for a wide range of
reasonable parameter values.
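
The running-spares figures can be tabulated for a few values of F. This is an illustrative sketch using the parameter values quoted above, together with D = .01 hr. as in the earlier computations; the function names and the particular values of F are choices made here.

```python
def a0(F, X, k):
    """Single-copy availability."""
    return F / (F + X + k * X)


def a2(F, X, D, L, k, Y):
    """Strategy 2 availability."""
    return 1 - (D + L + k * Y) / (F + X + k * X)


# Running-spares parameters in hours: instantaneous load, nearly current copy.
X, D, L, k, Y = 1.0, 0.01, 0.0, 0.01, 0.1

rows = []
for F in (10.0, 20.0, 50.0):
    single = a0(F, X, k)
    spare = a2(F, X, D, L, k, Y)
    improvement = (spare - single) / single
    rows.append((F, single, spare, improvement))
    print(f"F = {F:4.0f}: A_0 = {single:.4f}, A_2 = {spare:.4f}, I = {improvement:.4f}")
```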

Effect of varying Y. We have looked at three separate cases
which differ from one another in large part in the widely differing
values for the parameter Y. To better understand the effect of this
parameter, we select typical values of the other parameters (X = 1,
L = 0.5, D = 0.01, F = 20) and consider A_2 as a function of Y for
several different values of k. When k = .01, we have

$$A_2 = 1 - \frac{0.51 + 0.01Y}{21.01}$$

The small coefficient of Y in this case makes the effect of Y minimal.
As Y ranges between 0 and 20, A_2 decreases linearly from 0.976 to 0.966.
Now suppose that k is increased to 0.05. In this case as Y goes from
0 to 20, A_2 decreases from 0.976 to 0.953 - still not a very dramatic
change! To a large extent what makes the "running-spare" approach so
worthwhile is not the small value of Y but the instantaneous access
(L = 0).
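
For the k = .01 case, the dependence of A_2 on Y can be tabulated directly from the Strategy 2 formula. The sketch below covers only that case; the function name and the sample values of Y are choices made here for illustration.

```python
def a2_vs_y(Y, k=0.01, F=20.0, X=1.0, D=0.01, L=0.5):
    """Strategy 2 availability as a function of the audit-trail age Y (hours)."""
    return 1 - (D + L + k * Y) / (F + X + k * X)


values = {Y: a2_vs_y(Y) for Y in (0, 1, 5, 10, 20)}
for Y, availability in values.items():
    print(f"Y = {Y:2d} hr.: A_2 = {availability:.3f}")
```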
Conclusions  and  Plans  for  Future  Work 

We have presented here a model for data availability which,
while superficial, does seem to reflect the realities of various
strategies for backup. We have seen that remote journaling, in the
sense of storing a copy in archival storage (e.g. tape) at a remote
site, leads to very little in the way of availability improvement -
perhaps 5 percent at best. On the other hand, the running spares
strategy, in which the remote copy is nearly up to date and almost
immediately accessible, brings availability up to over 99 percent and
appears to be worthwhile. It should be noted, however, that the running
spares strategy is bound to be relatively expensive. Furthermore,
before this strategy can be effectively used, many of the problems of
multi-copy management must be solved. For example, updating must be
synchronized in order for the backup copy to be effectively kept up to
date.

One point to notice about the model is the importance of the
parameter k. We found that remote journaling was theoretically of no
value unless k was fairly small. The parameter k is essentially a
proportionality factor, determining how long it takes a processor to
"catch up" when there has been a backlog of updates accumulating. The
value of k will depend on many factors - the rate at which updates are
generated, the complexity of the updating procedure, the processor
speed, etc. Some of these factors and how they enter into k are
amenable to theoretical study; others require system measurement.

Another feature of the "catch-up" time deserves some thought.
Is kY an adequate expression for this? Or should one then add on k*kY
to take account of the updates that have been entered while the first
set was being processed, and so forth? Adding on these terms would add
little complexity to the model; but it seems hardly worthwhile as long
as k is so uncertain. That is, k as an effective proportionality
constant can be assumed to include the effects of the higher order
terms.
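
If the higher order terms were written out, the catch-up time would form a geometric series. As a worked illustration (assuming k < 1, i.e. updates arrive more slowly than they can be processed, which the report does not state explicitly), the series sums in closed form:

```latex
kY + k^2 Y + k^3 Y + \cdots \;=\; kY \sum_{n=0}^{\infty} k^n \;=\; \frac{kY}{1 - k},
\qquad 0 \le k < 1 .
```

The correction factor 1/(1 - k) is about 1.01 for k = .01 and about 1.14 for k = 1/8, so replacing k by an effective value k/(1 - k) changes little when k is small.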

Finally, further work on this model should include some careful
statistical analysis of a number of questions. What is the
probability that a host will fail during the recovery process? What is
the probability that a "new" master copy will fail before the old one
has been repaired? (In both of these cases more than two copies would
be advantageous.) How many copies are needed to achieve a given level
of availability? Is there some "optimum" number of copies? In short,
there are a number of interesting questions which can be addressed if
the parameters in the model are considered to be random variables
instead of simple average values.
A Response-Time Model for Distributed Data

Introduction 

The  hypothesis  to  be  studied  in  this  section  is  that,  because 
of  disparity  in  site  loads,  sending  queries  to  a  remote  site  may  improve 
response  time,  in  spite  of  network  delays.   (Response  time  we  define 
here  to  be  the  length  of  time  between  when  the  user  inputs  a  query  and 
when  he  receives  the  response.)   We  assume  that  the  data  base  is  equally 
available  (stored  on  disk  and  kept  up  to  date)  at  all  of  the  alternate 
sites.   We  also  will  ignore  such  things  as  the  effect  of  increased 
availability  on  real  response  time.   (That  is,  if  there  is  only  a  single 
copy  and  that  goes  down  for  several  hours,  the  response  time  during  that 
period  is  clearly  very  poor.   But  this  effect  is  hard  to  include  in  our 
model  in  its  present  primitive  state.) 

The  problems  which  arise  in  trying  to  develop  a  model  of  this 
type  are  extremely  difficult.   First  of  all,  the  question  of  how  machine 
"load"  is  to  be  defined  and  measured  has  never  been  satisfactorily 
resolved.   We  are  forced  to  simply  assume  that  there  is  such  a  quantity 
and  that  it  increases  in  proportion  to  the  number  of  jobs  in  the  system. 
Second,  it  is  uncertain  as  to  how  response  time  is  affected  by  system 
load.   The  most  relevant  work  that  we  have  been  able  to  find  in  the 
literature  is  in  Scherr's  monograph  [Scherr,  1967]  on  time-shared 
systems.   Scherr  carried  out  both  theoretical  and  experimental  studies 
of  response  time  as  a  function  of  the  number  of  users  on  the  system.   He 
found  that,  for  a  small  number  of  users,  the  response  time  is  nearly 
constant,  showing  only  a  slow  rise  as  the  number  of  users  increases. 
At a certain point (defined as the "saturation" level) the response curve
takes a sharp upward turn, rising linearly with number of users
thereafter. This general shape is pictured below.


[Figure: response time (vertical axis) plotted against number of users
(horizontal axis).]


Since Scherr assumes that the users keep the system busy, it seems valid
to assume that response time also increases linearly with load when the
load is reasonably heavy. That is, we assume that the region of this
curve that is pertinent to our study of the advantages of data
distribution is the steep linear rise.

The Model

Parameters. The parameters in the model are as follows:

$N$ = number of computers in the network which possess a copy of the
      given data base.

$G_i$ = that part of the load at computer $i$ which is not related to
      data base use.

$V$ = number of updates per unit time.

$H$ = number of queries per unit time.

$v$ = load induced by an update rate $V = 1$ (so that $Vv$ is the total
      load due to updates).

$h$ = load induced by a query rate $H = 1$ (so that $Hh$ is the total
      load due to queries).

$a$, $\ell$ = parameters describing the linear increase in response time
      as load increases. That is, at a single site,

      Response time = $a(\text{load} - \ell)$.

      We assume these parameters are the same for all sites.

$T_n$ = increase in response time due to network delays and overhead
      of sending a query to a remote site.

The equations. Suppose, for simplicity, that all queries are
entered at a single site (at computer 1, say) that possesses a copy of
the data base. If the site opts to respond to the entire query load
itself, then its total load is

$$Vv + Hh + G_1 .$$

The single-site response time $R_s$ is then given by

$$R_s = a(Vv + Hh + G_1 - \ell) .$$

Now if the site decides to distribute the queries equally among the $N$
sites which have a copy, the load on computer $i$ is

$$Vv + Hh/N + G_i .$$

Notice that all sites are assumed to have equal update loads, since all
sites have the responsibility of keeping their copies as up to date as
possible. The response time for a query answered locally is then

$$R_1 = a(Vv + Hh/N + G_1 - \ell) ,$$

while the response time for a query answered at remote site $i$ is

$$R_i = a(Vv + Hh/N + G_i - \ell) + T_n ,$$

where $i \neq 1$.

The average response time $R$ is then

$$R = \frac{1}{N}(R_1 + R_2 + \cdots + R_N) .$$

The quantity of interest is the ratio

$$R / R_s .$$

If $R/R_s < 1$, response time is improved by distributing the queries.
We therefore would like to obtain some idea of the conditions under
which $R/R_s < 1$.
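
The equations of this section can be evaluated directly. The following
is a minimal sketch in Python, not part of the original report; the
function names and the list convention (index 0 stands for computer 1,
the site where queries are entered) are assumptions made for
illustration only.

```python
def single_site_response(a, ell, V, v, H, h, G1):
    """R_s = a(Vv + Hh + G_1 - ell): site 1 answers every query itself."""
    return a * (V * v + H * h + G1 - ell)


def equal_split_responses(a, ell, V, v, H, h, G, T_n):
    """Per-site response times R_1..R_N when queries are split equally.

    G is the list [G_1, ..., G_N]; index 0 is the local site (computer 1).
    Every remote site (index >= 1) pays the extra network delay T_n.
    """
    N = len(G)
    times = []
    for idx, G_i in enumerate(G):
        r = a * (V * v + H * h / N + G_i - ell)
        if idx != 0:
            r += T_n  # query and response cross the network
        times.append(r)
    return times


def response_ratio(a, ell, V, v, H, h, G, T_n):
    """Ratio R / R_s; a value below 1 means distributing the queries helps."""
    times = equal_split_responses(a, ell, V, v, H, h, G, T_n)
    average = sum(times) / len(times)
    return average / single_site_response(a, ell, V, v, H, h, G[0])
```

With the purely illustrative values a = 0.5, ell = 2, V = v = h = 1,
H = 4, G = [5, 5] and T_n = 1, the single-site response time is 4.0, the
two distributed response times are 3.0 and 4.0, and the ratio is 0.875.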

Use  of  the  Model 

For simplicity, consider the case $N = 2$. Then

$$\frac{R}{R_s} = \frac{R_1 + R_2}{2R_s}
  = 1 + \frac{G_2 - G_1 - Hh + T_n/a}{2(Vv + Hh + G_1 - \ell)} .$$

The denominator of the second term is always positive, by our assumption
that loads are large enough that response time is described by the
steeply rising line. Therefore the sign of the numerator determines
whether $R/R_s$ is greater than or less than one. That is, we have the
result:

      Distribution of the queries improves response
      time if and only if

      $aHh + a(G_1 - G_2) > T_n$.

Now the parameter $a$ is the rate of increase of response time with respect
to load - the slope of the response-time curve. Thus the left side of
the above inequality is just an increase in response time due to the
query load and the load differential between sites 1 and 2. It is
intuitively reasonable that when this quantity becomes greater than $T_n$
(the increase in response time due to network delays and overhead), it
pays to distribute. For general $N$ the inequality becomes hardly more
complex:

      Distribution improves response time if
      and only if

      (I)   $aHh + a(G_1 - \bar{G}) > T_n$,

      where $\bar{G}$ is the average load at the remote sites;
      i.e., $\bar{G} = (G_2 + G_3 + \cdots + G_N)/(N - 1)$.

An interesting point to notice is that, if the query load is
sufficiently large, distributing the queries may improve response even
if the local site is less heavily loaded than the remote sites.
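
Inequality (I) can be checked mechanically. The sketch below is an
illustration in Python rather than anything given in the report; the
function name and the convention that `G[0]` is the local load are
assumptions.

```python
def distribution_improves(aHh, a, G, T_n):
    """Evaluate inequality (I): aHh + a(G_1 - G_bar) > T_n.

    aHh -- response-time increase caused by the full query load
    a   -- slope of the response-time curve
    G   -- background loads [G_1, ..., G_N]; G[0] is the local site
    T_n -- delay added to a query that is answered remotely
    """
    remote = G[1:]
    if not remote:
        raise ValueError("need at least one remote site")
    G_bar = sum(remote) / len(remote)  # average load at the remote sites
    return aHh + a * (G[0] - G_bar) > T_n
```

For example, with hypothetical loads G = [4, 6, 6], a = 0.5 and T_n = 1,
the local site is the less heavily loaded one, yet the test succeeds for
aHh = 3 (left side 2) and fails for aHh = 1.5 (left side 0.5).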

Determination  of  the  parameter  values  to  use  in  this  model 
poses  a  difficult  problem.   As  was  noted  earlier,  the  concept  of  load  is 
not well defined. Values for the $G_i$ are difficult to come by. It may
be possible, however, to make simple assumptions. For example, one
could assume that all sites are approximately equally loaded. In this
case, inequality (I) becomes

      (I')   $aHh > T_n$.

At this point we have quantities which undoubtedly can be measured.
Even though we don't know what "load" is and would find it hard to
determine $a$ and $h$ individually, the term $aHh$ can be determined as
follows. Measure the response times $R(H_1)$ and $R(H_2)$ for two
different query rates $H_1$ and $H_2$. Then, assuming that the system is
sufficiently heavily loaded so that these points fall on the steep linear
rise of the response-time curve (this point can be checked by further
measurements),

$$ah = \frac{R(H_1) - R(H_2)}{H_1 - H_2} .$$

Once we have a good estimate for $ah$, we can estimate $aHh$ for arbitrary
$H$. Notice that this same approach will yield estimates of the left side
of the inequality above even if we are not measuring a true query rate
$H$, but only some parameter $H'$ proportional to $H$. If the network is
homogeneous, $T_n$ can simply be measured by sending off some queries and
comparing the response time to that for locally handled queries. A data
management system can then automatically monitor query rate and response
times and use inequality (I') to decide when queries should be
distributed.
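
The measurement procedure just described can be sketched as follows.
This is an illustrative Python fragment, not the report's own procedure;
the function names and the sample measurements are hypothetical.

```python
def estimate_ah(H1, R1, H2, R2):
    """Estimate the slope ah = (R(H_1) - R(H_2)) / (H_1 - H_2).

    Both measurements are assumed to lie on the steep linear part of the
    response-time curve.
    """
    if H1 == H2:
        raise ValueError("the two query rates must differ")
    return (R1 - R2) / (H1 - H2)


def should_distribute(H, ah, T_n):
    """Inequality (I'): with equally loaded sites, distribute if aHh > T_n."""
    return ah * H > T_n
```

Suppose response times of 3.0 and 5.0 seconds were measured at query
rates of 10 and 20 per unit time. The estimate of ah is then 0.2, and
with a measured T_n of 1.5 seconds the rule says to keep queries local at
H = 5 (aHh = 1.0) and to distribute them at H = 10 (aHh = 2.0).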

Generalizations 

Unequal distribution of queries. Suppose that the queries,
instead of being divided equally among $N$ sites, are divided arbitrarily,
a fraction $w_i$ being handled by the $i$th site. Then

$$\sum_{i=1}^{N} w_i = 1$$

and the appropriate quantity to take for the average load $\bar{G}$ at
the remote sites is the weighted average

$$\bar{G} = \sum_{i=2}^{N} w_i G_i / (1 - w_1) .$$

Inequality (I) then becomes

      (II)   $aHh\,(1 - \sum_{i=1}^{N} w_i^2)/(1 - w_1) + a(G_1 - \bar{G}) > T_n$.
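
Inequality (II) can be evaluated for any proposed set of weights. The
Python sketch below is illustrative only; the function name is an
assumption, and the choice to return False when nothing is sent away
(w_1 = 1) is a convention adopted here, since in that case the
distributed and single-site response times coincide.

```python
def weighted_distribution_improves(aHh, a, G, w, T_n):
    """Evaluate inequality (II) for weights w = [w_1, ..., w_N].

    G[0] and w[0] refer to the local site; the weights must sum to one.
    """
    if len(G) != len(w):
        raise ValueError("G and w must have the same length")
    if abs(sum(w) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    w_1 = w[0]
    if w_1 >= 1.0:
        return False  # convention: no query leaves the local site
    # Weighted average load at the remote sites.
    G_bar = sum(w_i * G_i for w_i, G_i in zip(w[1:], G[1:])) / (1.0 - w_1)
    lhs = aHh * (1.0 - sum(w_i ** 2 for w_i in w)) / (1.0 - w_1)
    lhs += a * (G[0] - G_bar)
    return lhs > T_n
```

With equal weights the factor (1 - sum of squares)/(1 - w_1) equals one,
so the test reduces to inequality (I). With hypothetical values
aHh = 3, a = 0.5, G = [4, 6, 6] and T_n = 3, equal weights fail the test
(left side 2), while weights [0.8, 0.1, 0.1] pass it (left side about
4.1).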

Once the concept of distributing the query load unequally among the
various sites is introduced, it becomes of interest to study
optimization of the distribution. What we mean by optimization is the
determination of a set of weights $w_1, w_2, \ldots, w_N$ such that $R$
is a minimum. Let us consider how this problem can be solved for $N = 2$.
In this case $w_2 = 1 - w_1$, and we can write $R$ in terms of the single
variable $w_1$, the fraction of query load to be handled locally. In
detail,

$$\begin{aligned}
R &= w_1 R_1 + (1 - w_1) R_2 \\
  &= a\{w_1 (Vv + w_1 Hh + G_1 - \ell)
      + (1 - w_1)(Vv + (1 - w_1)Hh + G_2 - \ell)\}
      + (1 - w_1) T_n \\
  &= aVv + aHh(w_1^2 + (1 - w_1)^2) + w_1 G_1 a
      + (1 - w_1) G_2 a - a\ell + (1 - w_1) T_n .
\end{aligned}$$

Then

$$\frac{\partial R}{\partial w_1} = aHh(4w_1 - 2) + a(G_1 - G_2) - T_n .$$

If we set this derivative equal to zero, we find that there is a
prospective extremum at

$$w_1 = \frac{T_n - a(G_1 - G_2) + 2aHh}{4aHh} , \qquad
  w_2 = \frac{a(G_1 - G_2) - T_n + 2aHh}{4aHh} .$$

Since the second derivative ($4aHh$) is always positive, this extremum is
in fact a minimum, as desired. We must, however, examine another
constraint - that the weights $w_1$ and $w_2$ must both be positive. We
can rewrite $w_1$ and $w_2$ as

$$w_1 = \frac{1}{2} + \frac{T_n - a(G_1 - G_2)}{4aHh} , \qquad
  w_2 = \frac{1}{2} - \frac{T_n - a(G_1 - G_2)}{4aHh} .$$

The weights $w_1$ and $w_2$ can be seen to be positive under a wide range
of conditions; for example, if $G_1 = G_2$ and inequality (I') holds.
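
The two-site optimum and its positivity check can be computed as in the
following Python sketch. It is an illustration under the model's
assumptions, not code from the report; the function names are
hypothetical, and `variable_cost` keeps only the terms of the average
response time that depend on w_1 (the constant aVv - a*ell is dropped).

```python
def optimal_two_site_weights(aHh, a, G1, G2, T_n):
    """Return (w_1, w_2, feasible) for the N = 2 optimum.

    w_1 = 1/2 + (T_n - a(G_1 - G_2)) / (4 aHh),  w_2 = 1 - w_1.
    feasible is False when either weight is not positive, in which case
    the stationary point is not a meaningful strategy.
    """
    shift = (T_n - a * (G1 - G2)) / (4.0 * aHh)
    w1 = 0.5 + shift
    w2 = 0.5 - shift
    return w1, w2, (w1 > 0 and w2 > 0)


def variable_cost(w1, aHh, a, G1, G2, T_n):
    """The w_1-dependent part of the average response time for N = 2."""
    w2 = 1.0 - w1
    return aHh * (w1 ** 2 + w2 ** 2) + a * (w1 * G1 + w2 * G2) + w2 * T_n
```

For the illustrative values aHh = 2, a = 0.5, G_1 = G_2 = 5 and T_n = 1,
the weights come out as w_1 = 0.625 and w_2 = 0.375, and evaluating
`variable_cost` slightly to either side of w_1 = 0.625 gives a larger
value, as expected at a minimum.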

Some interesting conclusions can immediately be read from the
equations for $w_1$ and $w_2$. First, we note that if the loads are equal
($G_1 = G_2$) the local site should always handle more than half of the
queries. Only when $T_n = a(G_1 - G_2)$, so that the network delay equals
the increase in response time due to load differential, should the query
loads be equalized. And only when $T_n$ is less than $a(G_1 - G_2)$ should
the local site send off more queries than it keeps.

It must again be emphasized that careful measurements are
required for these relationships to be useful for real decision making.
It is easy to estimate that $T_n$, $aHh$, and $a(G_1 - G_2)$ may all,
under reasonable assumptions, be on the order of one to two seconds. This
information is not at all helpful for developing long-term strategies,
but merely demonstrates that the optimum decision on query sharing
should be made dynamically and only after monitoring current system
usage and response.

The above analysis for optimum distribution strategy was
done for the $N = 2$ case. The general case can be handled similarly,
but is more complicated because of the multi-variable minimization.
Setting the derivatives to zero and solving yields the following
equations for $i \neq 1$:

$$w_i = \frac{1}{2}\Bigl(1 - \sum_{j \neq 1,\, i} w_j\Bigr)
        - \frac{T_n - a(G_1 - G_i)}{4aHh} .$$

Clearly this reduces to the simple formula found above for $N = 2$,
$i = 2$. But in this case we have a set of simultaneous, linear, algebraic
equations in $w_2, \ldots, w_N$ to solve. It is a simple matter to show
that this set of equations has a unique solution, readily obtainable by
computation, and that this solution does minimize $R$. Again, it is
necessary to check that the weights $w_i$ that are computed are all
positive, in order that the solution be meaningful.
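
One way to obtain the solution by computation is sketched below in
Python. The rearrangement used is an assumption of this sketch rather
than a step stated in the report: since the weights sum to one, each
equation above is equivalent to w_i - w_1 = c_i with
c_i = (a(G_1 - G_i) - T_n)/(2aHh), and summing over all sites then gives
w_1 = (1 - sum of the c_i)/N. The function name is hypothetical.

```python
def optimal_weights(aHh, a, G, T_n):
    """Solve the stationarity equations for general N.

    G = [G_1, ..., G_N] with G[0] the local site. Returns (w, feasible),
    where w = [w_1, ..., w_N] and feasible is False if any weight is not
    positive (the solution is then not a meaningful strategy).
    """
    N = len(G)
    if N < 2:
        raise ValueError("need at least one remote site")
    # c_i = w_i - w_1 for each remote site i = 2..N.
    c = [(a * (G[0] - G_i) - T_n) / (2.0 * aHh) for G_i in G[1:]]
    w1 = (1.0 - sum(c)) / N
    w = [w1] + [w1 + c_i for c_i in c]
    return w, all(w_i > 0 for w_i in w)
```

For three equally loaded sites with the illustrative values aHh = 2,
a = 0.5 and T_n = 1, this gives weights 0.5, 0.25 and 0.25; for N = 2 it
reproduces the two-site formula.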

Usage from various sites. All of the analysis so far has been
under the assumption that the query load all originates at a single
site. Suppose instead that each site $i$ generates some fraction $f_i$ of
the total query load $H$. Site $i$ then distributes its query load with a
strategy described by weights $w(i)_1, w(i)_2, \ldots, w(i)_N$. The net
rate of input of queries that a site $i$ must respond to is then given by

$$H_i = \sum_j f_j H\, w(j)_i ,$$

so that site $i$'s response time (i.e. time to respond to a query) is

$$R_i = a(Vv + H_i h + G_i - \ell) .$$

From the point of view of site $j$, the average response time seen is
computed as

$$\bar{R}_j = \sum_i w(j)_i R_i + (1 - w(j)_j) T_n ,$$

since a network delay of $T_n$ is observed for the fraction of queries
answered remotely. Now to get an average response time for queries
originated throughout the network, we must take another weighted
average:

$$R = \sum_j f_j \bar{R}_j .$$

Combining the preceding four equations, we get an equation for $R$ in
terms of the $N^2$ variables $w(j)_i$. As above, we can carry out an
optimization analysis or compare various strategies. (For example, the
strategy where each site handles its own queries is described by
$w(j)_i = 1$ when $i = j$ and $w(j)_i = 0$ otherwise.) We will not go
into further details on this generalization in this brief report.
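
The four equations for the multi-site case combine into a short
computation, which makes it easy to compare strategies numerically. The
Python sketch below is illustrative only; the function names and the
matrix layout (`w[j][i]` is the fraction of site j's queries sent to
site i) are assumptions.

```python
def network_average_response(a, ell, Vv, H, h, G, f, w, T_n):
    """Network-wide average response time R for a routing matrix w.

    G[i] -- background load at site i
    f[j] -- fraction of the total query rate H that originates at site j
    w[j][i] -- fraction of site j's queries that site i answers
    """
    N = len(G)
    # H_i: net query rate that site i must respond to.
    H_in = [sum(f[j] * H * w[j][i] for j in range(N)) for i in range(N)]
    # R_i: time site i takes to respond to a query.
    R_site = [a * (Vv + H_in[i] * h + G[i] - ell) for i in range(N)]
    # Average response seen by site j, with T_n on the remote fraction.
    R_seen = [
        sum(w[j][i] * R_site[i] for i in range(N)) + (1.0 - w[j][j]) * T_n
        for j in range(N)
    ]
    return sum(f[j] * R_seen[j] for j in range(N))


def own_queries_strategy(N):
    """w(j)_i = 1 when i = j and 0 otherwise: every site serves itself."""
    return [[1.0 if i == j else 0.0 for i in range(N)] for j in range(N)]
```

As a consistency check with hypothetical values (a = 0.5, ell = 2,
Vv = 1, H = 4, h = 1, G = [5, 5], T_n = 1): if all queries originate at
site 1 and are split equally, R is 3.5, which is the single-origin
average obtained from the earlier equations for the same numbers.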

Proposed  further  generalizations.   We  list  here  several  other 
ways  in  which  assumptions  may  be  relaxed  and  the  model  made  more  flexible. 

1) We have assumed that $T_n$ is a constant. To be realistic, $T_n$
should depend upon the two sites between which the messages (query and
response) are traveling. Thus we need to insert into the model a set of
values $T_n(i,j)$. In addition, the values of $T_n(i,j)$ may vary
depending upon what routes are taken - but it is undoubtedly adequate to
take average values. Finally, the values of $T_n(i,j)$ will vary with the
amount of network traffic. In particular, if we assume that query
traffic forms a non-negligible fraction of network traffic, $T_n(i,j)$
will be some function of $H$. Theoretical analysis (e.g. by queueing
theory) can probably be used to determine this function, which will
depend on network parameters as well as on $H$ and the distribution
strategy.

2) We have assumed that the parameters $a$ and $\ell$, which describe
response time as a function of "load," are the same for all sites. This
assumption is not true in a heterogeneous network, or for a network of
dissimilarly configured "homogeneous" hosts (e.g. the PWIN). It should
be noted, however, that the parameter $\ell$ did not enter into any of the
decision relations, except in the assumption that "load" must be large
enough compared to $\ell$ so that the linear expression for response time
holds. In addition, we have seen that $a$ is measured only as it occurs
combined with other factors. That is, differences in $a$ may be taken
into account by varying the $H_i$. (See preceding section.) Thus the
practical impact on the model of allowing $a$ and $\ell$ to vary from site
to site is probably minimal.

3) We have assumed that all sites have an equal load associated
with updating the data base. This will not in general be true.
If, say, the updates all originate at Site 1, the other sites will all
incur network overhead in processing the updates. On the other hand,
Site 1 may do much preprocessing to make the update task simpler at the
remote sites. Details of this sort could be built into the model.
Results cannot be too different, however, since site differences in the
terms $Vv$ can always be subsumed in the $G_i$.

4) We have assumed that $N$ sites in the network have up-to-date
copies of the data base and the problem is to determine a strategy for
distributing the queries as they are entered into the system. A
somewhat different, but very similar, model is needed to study the
problem of setting a policy for distributing the data base - i.e., for
determining which sites should have copies. In this case a careful
analysis of the effects of updates will be essential. The sites at which
updates originate will find their loads increased by the necessity of
distributing the updates to other sites having a copy of the data base.
Remote sites holding a copy of the data base will all have increased
loads due to the processing of updates and associated network overhead.
Strategies for synchronization and aspects of multi-copy management will
affect loads and hence response times. These effects will, of course,
implicitly enter into the query distribution model, but there we assumed
that response times were obtained by direct system measurement, so that
lower-level details were not modeled, but included in measured
parameters. To determine a distribution policy a priori requires modeling
these lower-level effects and, if possible, optimizing the lower-level
strategies (e.g. for synchronization) so that the policy is based on the
best up-to-date technology.

5) The four proposed generalizations listed above involve
relatively straightforward extensions of the approach described in this
report. In addition, however, the approach itself needs investigation.
That is, we have assumed that response time is a simple linear function
of "load," and that "load" can be described as a simple linear
combination of updates, queries, etc. It should be possible to examine
these assumptions - both by measurement and by stochastic queueing
analysis. A careful examination of this type is expected to yield some
refinements in the relations studied here.